We prove a new version of the quantum threshold theorem that applies to concatenation
of a quantum code that corrects only one error, and we use this theorem to derive a 
rigorous lower bound on the quantum accuracy threshold $\varepsilon_0$. Our proof also applies
to concatenation of higher-distance codes, and to noise models that allow faults to be 
correlated in space and in time. The proof uses new criteria for assessing the accuracy of 
fault-tolerant circuits, which are particularly conducive to the inductive analysis of recursive
simulations. Our lower bound on the threshold, $\varepsilon_0 > 2.73 \times 10^{-5}$ for an adversarial
independent stochastic noise model, is derived from a computer-assisted combinatorial 
analysis; it is the best lower bound that has been rigorously proven so far. 

1 Introduction 

Our hopes that large-scale quantum computers will be built and operated someday are founded on the theory 
of quantum fault tolerance [1]. A centerpiece of this theory is the quantum threshold theorem, which asserts
that an arbitrarily long quantum computation can be executed with high reliability, provided that the noise 
afflicting the computer's hardware is weaker than a certain critical value, the accuracy threshold [2, 3, 4, 5, 6]. 
In this paper, we will present new proofs of this fundamental theorem, and new rigorous lower bounds on 
the accuracy threshold. 

There have been several very interesting recent developments concerning the accuracy threshold. The 
original threshold theorem applied to the standard quantum circuit model with nonlocal quantum gates, and 
for Markovian noise. New threshold theorems have been proved for non-Markovian noise [7], and for the 
cluster-state model of computation [8, 9], and numerical estimates of the threshold have been carried out 
for computation using local gates [10, 11]. Furthermore, for computation with nonlocal gates, significantly 
improved estimates of the threshold have been found [12, 13, 14]. These recent developments build on the 
foundations provided by the threshold theorem first proved by Aharonov and Ben-Or [2], by Kitaev [3], 
and by Knill, Laflamme, and Zurek [4]. Our purpose in this paper is to reexamine and strengthen these
foundations. 



Our goal is to assess the reliability of a quantum computer whose gates perform imperfectly as described 
by some specified noise model. Though more general noise models can be analyzed (and will be, in Sec. 11 
of this paper), it is illuminating to consider the special case of independent stochastic faults. In this noise 
model, faults are independently and identically distributed at the locations of the noisy quantum circuit; 
that is, at each location either the gate is executed perfectly (with probability $1 - \varepsilon$), or a fault occurs (with
probability $\varepsilon$); we say that $\varepsilon$ is the fault rate. (Faults can occur even if the ideal gate is the identity; the
"resting" qubits are subject to storage errors.) Though the fault locations are chosen probabilistically, once 
the locations are chosen the action of the faulty gates can be chosen adversarially; that is, we allow the faults 
to be arbitrary trace-preserving quantum operations. 
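
As a minimal sketch of this noise model (not part of the original analysis), the following Python snippet draws a set of fault locations for a circuit with a given number of locations. The function name `sample_fault_locations` and its parameters are illustrative assumptions. Only the positions of the faults are sampled, since in the adversarial model the action of each faulty gate is not random.

```python
import random

def sample_fault_locations(num_locations, eps, seed=None):
    """Return the sorted list of locations at which a fault occurs.

    Each location is faulty independently with probability eps.
    Only the positions are random; what a faulty gate does is left
    unspecified, because the model lets an adversary choose it.
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must be a probability")
    rng = random.Random(seed)
    return [loc for loc in range(num_locations) if rng.random() < eps]

# Hypothetical circuit with 1000 locations and fault rate 1%.
faults = sample_fault_locations(num_locations=1000, eps=0.01, seed=1)
print(len(faults), "faulty locations out of 1000")
```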

The noisy gates are used to build a fault-tolerant simulation of an ideal quantum circuit. In this simulation,
the logical qubits processed by the computer are protected from damage using a quantum error-correcting
code [15, 16], and the gates acting on the logical qubits are realized by "gadgets" that act on
the code blocks. The gadgets exploit the redundancy of the quantum error-correcting code to diagnose and 
remove errors caused by faults; they are carefully designed to minimize propagation of errors among qubits 
within the same code block. 

The fault-tolerant simulations that we will analyze here are based on concatenated quantum codes [17]. 
The code block of a concatenated code is constructed as a hierarchy of codes within codes: the code block
at level $k$ of this hierarchy is built from logical qubits encoded at level $k-1$ of the hierarchy. Likewise,
our fault-tolerant gadgets are constructed as a hierarchy of gadgets within gadgets: the gadgets at level
$k$ are built from gate gadgets at level $k-1$. The basic idea of the threshold theorem is very simple: if $\varepsilon$ is
below the accuracy threshold $\varepsilon_0$, then the level-1 simulation of the ideal circuit will be more reliable than
an unprotected "level-0" circuit. Because of the self-similarity of the gadgets, it then follows that the level-2
simulation will be still more reliable, and so on. Thus for $\varepsilon < \varepsilon_0$, our recursive fault-tolerant simulation
becomes arbitrarily reliable as we increase the level of concatenation.

We remark that for the threshold theorem to apply, two features of the simulation are essential: First, 
quantum gates can be executed in parallel — otherwise we would be unable to control errors that occur 
simultaneously in different parts of the computer. Second, qubits can be discarded and replaced by fresh 
qubits — otherwise we would be unable to flush from the computer the entropy introduced by noise [18]. 

We will discuss two versions of the threshold theorem, each appropriate for a different type of concatenated 
coding scheme. One version applies if each code in the recursive hierarchy can correct two or more errors 
in the code block (it is a quantum code whose distance is at least 5). This is the scheme considered by 
Aharonov and Ben-Or [2] and by Kitaev [3]. The other version applies even if each code in the recursive 
hierarchy corrects only one error in the code block (it is a distance-3 quantum code). This is the scheme 
considered by Knill, Laflamme, and Zurek [4] and in much subsequent work. 

These two versions differ because the fault-tolerant gadgets have different properties depending on which 
coding scheme is used. For a distance-5 code, level-1 gadgets can be designed such that, if there is one fault 
in the gadget and no more than one error in each of its input blocks, then there is no more than one error 
in each of its output blocks. As was shown in [2, 3] using an inductive argument, a similar property can 
be established at each level of the recursive hierarchy, which suffices to prove the threshold theorem. We 
will present a new proof that uses the same concepts; this proof actually follows closely the proof due to 
Aharonov and Ben-Or [2], but we hope that some readers will find our proof especially clear and accessible. 

Distance-3 codes have a smaller block size than distance-5 codes with the same number of logical qubits; 
hence fault-tolerant gadgets with fewer gates can be constructed using distance-3 codes, and the threshold 
fault rate $\varepsilon_0$ is expected to be correspondingly larger. Therefore, a threshold theorem that applies to
concatenated distance-3 codes is highly desirable. But in contrast to the distance-5 case, since the code can 
correct only one error, a level-1 gadget with one fault may fail if one of its input blocks has a single error; 
hence the threshold theorem formulated in [2, 3] does not apply.

A different (and on the face of it more complex) method of analysis is needed, in which the effectiveness
of each gadget is predicated on the performance of the gadgets that immediately precede it in the circuit. 
Using such reasoning, Knill, Laflamme, and Zurek found a criterion for a level-1 simulation to outperform
a level-0 simulation, and asserted without proof that this criterion provides a lower bound on the accuracy
threshold [4]. The main results of this paper are a proof of the threshold theorem for concatenated distance-3
codes that justifies the criterion stated in [4], and a rigorous estimate of the accuracy threshold based on
this theorem: $\varepsilon_0 > 2.73 \times 10^{-5}$. Our proof turns out to be remarkably simple, simpler in some ways than
our proof for the distance-5 case. As far as we know, no proof of the threshold theorem for concatenated 
distance-3 codes has previously been published, for either quantum or classical computation. 

The threshold estimate in [4] is done by finding an upper bound on the failure probability of a level-1
gadget. Failure occurs only if two faults occur in an "extended gadget," so it suffices to count the pairs of 
locations in the extended gadget. Not only do our results put such estimates on a fully rigorous footing; we 
are also able to improve the threshold estimate, for two reasons. First, our gadgets are more efficient than 
those constructed in [4]. Second, many pairs of fault locations in the extended gadget are actually benign; 
they do not cause failure. Our new formulation of the threshold theorem allows us to do a more refined 
count of the malignant pairs of locations that actually can cause a gadget to fail. 

Another way to estimate the accuracy threshold, described in [5, 6, 19], is to derive and analyze a map (the 
concatenation flow equation) that expresses how an effective noise model evolves as the level of concatenation 
k increases. The accuracy threshold is an unstable fixed point (or fixed surface) of this map. This method 
applies to concatenated distance-3 codes, but turning such estimates into a rigorous theorem proves to be 
challenging for several reasons - for example, it is necessary to control the effects of error correlations that 
arise because code blocks interact multiple times as a circuit is executed. Two of the authors, whose work 
on this problem had long been stalled, were delighted to find a much simpler way to prove the threshold 
theorem for the distance-3 case (following in the footsteps of [4]) while also obtaining a rigorous threshold 
estimate that nearly matches the more heuristic result found by analyzing the concatenation flow equations. 

We will also present a proof of the quantum threshold theorem that applies to a local non-Markovian noise 
model, and which goes beyond the result found in [7] in two useful ways. First, we make weaker assumptions 
about the locality of the noise model than assumed in [7]; second, our analysis applies to "extended gadgets"
and therefore to concatenated distance-3 codes. 

Finally we remark that, while our estimate of the quantum accuracy threshold is the best that has so far 
been rigorously proven, recent studies by Knill have found numerical evidence for a much higher value of the 
threshold [13]. For the independent stochastic noise model, Knill calculates that for $\varepsilon < \varepsilon' \approx 3 \times 10^{-2}$, it is
possible to simulate a universal set of gates with effective fault rate below the threshold $\varepsilon_0$ for a concatenated
distance-3 code, thus establishing $\varepsilon'$ as an improved lower bound on the quantum accuracy threshold. The
theorem proved in this paper, providing a lower bound on $\varepsilon_0$, furnishes a rigorous foundation for one essential
part of Knill's analysis. An important goal for future work will be to make the rest of Knill's higher threshold
estimate fully rigorous. 

The rest of this paper is organized as follows. In Sec. 2 we explain the essential characteristics of fault-tolerant
gadgets built from noisy gates, and relate the goodness (sparseness of faults) of a level-1 simulation
to its correctness (i.e., accuracy). In Sec. 3 we define appropriate notions of goodness and correctness for
level-$k$ simulations, and state our main lemma, which asserts that a good level-$k$ simulation is correct. In
Sec. 4 we invoke the lemma to prove the quantum threshold theorem for independent stochastic noise, and 
in Sec. 5 we prove the lemma. In Sec. 6 we explain how to improve the threshold estimate by counting 
malignant pairs of fault locations. In Sec. 7 we discuss in more detail how fault-tolerant gadgets can be 
constructed, and in Sec. 8 we derive a rigorous lower bound on the quantum accuracy threshold by applying 
the result of Sec. 6 to gadgets based on a particular distance-3 code. In Sec. 9 we prove a threshold theorem 
that applies to concatenation of higher-distance codes, and in Sec. 10 we use a different method to prove a
threshold theorem whose content is similar to the result of [2]. In Sec. 11 we prove the threshold theorem 
for local non-Markovian noise, and Sec. 12 contains our conclusions. 

After our work was completed, Reichardt [20], using methods different from ours, also proved a quantum
threshold theorem for concatenated distance-3 codes. 

2 Fault-tolerant simulation at level 1 

Before proceeding to our proofs of the threshold theorem, we will describe the key properties of the gadgets 
used in our fault-tolerant simulations. To build intuition, we will at first discuss these properties rather 
informally; in Sec. 3, and again in Sec. 9, we will restate the properties in another language to formulate a 
precise statement of the threshold theorem. Explicit circuits for gadgets will be discussed in Sec. 7 and 8. 

Fault-tolerant simulations of quantum circuits are based on quantum error-correcting codes. Such codes 
can be constructed for d-dimensional quantum systems ("qudits"), but for definiteness we will imagine that
our elementary quantum systems, and the encoded systems protected by the code, are qubits (d = 2). We 
will denote by n the length of the code, the number of qubits in the code block. We will focus in this paper 
on codes with one encoded qubit per block, though our constructions can be generalized to codes with more 
than one encoded qubit per block. 

Suppose that $|\psi\rangle$ denotes a pure quantum state in the code space. The effect on $|\psi\rangle$ of an arbitrary error
(a trace-preserving completely positive map acting on the n qubits in the code block) can be expanded in
terms of n-qubit "Pauli operators":

$$|\psi\rangle \to \sum_a E_a |\psi\rangle \otimes |a\rangle_E ; \qquad (1)$$

here the states $|a\rangle_E$ are states of the "environment", which are not assumed to be normalized or mutually
orthogonal, and the $E_a$'s are summed over the $2^{2n}$ operators $\{I, X, Y, Z\}^{\otimes n}$, where $\{I, X, Y, Z\}$ are the
single-qubit Pauli operators.

We denote by w the weight of a Pauli operator, the number of qubits on which it acts nontrivially (with 
a Pauli operator other than the identity I). Most fault-tolerant protocols are founded on the hypothesis 
that highly correlated errors that damage many qubits at once should be rare: the norm $\| |a\rangle_E \|$ of the
state associated with a high-weight Pauli operator $E_a$ should be quite small. We say that a quantum
error-correcting code can correct t errors (for some t < n) if the code protects against all Pauli errors with weight
$w \le t$. Though uncorrectable errors with weight higher than t might occur, one hopes that these higher
weight errors are sufficiently suppressed that the protection against error is still reasonably effective. A code
that corrects t errors is said to have distance 2t + 1, because (at least roughly speaking) Pauli operators that
preserve the code space and act nontrivially within the code space have weight at least 2t + 1 [21]. Thus we
will say "distance-3 code" to indicate a quantum code that can correct one error, and "distance-5 code" to 
indicate a quantum code that can correct two errors. In either case, error recovery proceeds in two steps. 
First all of the "check operators" in a mutually commuting set are measured. This measurement projects the 
error onto a particular Pauli operator (or a set of Pauli operators that all act on the code space in the same 
way); furthermore the measurement outcomes constitute an error "syndrome" that points to a particular 
Pauli operator $E_a$ (or to a particular set of equivalent Pauli operators). Then the unitary operator $E_a^\dagger$ is
applied to reverse the damage due to the error. 
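
The notions of weight and distance used above can be made concrete with a short sketch. The helper names below (`pauli_weight`, `num_paulis_up_to_weight`, `distance_for`) are illustrative assumptions, and Pauli operators are represented simply as strings over I, X, Y, Z, ignoring phases.

```python
from math import comb

def pauli_weight(pauli):
    """Weight of an n-qubit Pauli operator given as a string over I, X, Y, Z."""
    if any(p not in "IXYZ" for p in pauli):
        raise ValueError("expected a string over I, X, Y, Z")
    # Count the qubits on which the operator acts nontrivially.
    return sum(1 for p in pauli if p != "I")

def num_paulis_up_to_weight(n, t):
    """Number of n-qubit Pauli operators with weight at most t."""
    # Choose w positions, then one of X, Y, Z at each chosen position.
    return sum(comb(n, w) * 3**w for w in range(t + 1))

def distance_for(t):
    """Distance 2t + 1 associated with a code that corrects t errors."""
    return 2 * t + 1

print(pauli_weight("IXIZY"))          # operator acting on 3 of 5 qubits
print(num_paulis_up_to_weight(7, 1))  # errors a 7-qubit code must handle for t = 1
print(distance_for(1), distance_for(2))
```

For a block of n = 7 qubits and t = 1 there are 1 + 3·7 = 22 Pauli operators of weight at most one, all of which a code correcting one error must handle.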

In a fault-tolerant simulation of an ideal quantum circuit, each gate in the ideal circuit is simulated by a 
gadget constructed from the noisy gates, which acts on the logical qubits that are protected by the code. In 
describing such simulations, we will make a distinction between an error, a position in a code block where 
a qubit has been damaged, and a fault, a location in a circuit where the operation applied by one of the
elementary gates deviates from the ideal operation. Fault-tolerant gadgets are designed to prevent an error
introduced by a fault from propagating to become many errors within a single code block. 

Fault-tolerant simulations of quantum computations share many of the same features as fault-tolerant 
classical simulations, and can be analyzed using similar methods. But there are also some new issues that 
arise in the quantum case that have no classical analog. First, while in the classical case we need only be 
concerned about bit flip (X) errors, in the quantum case we need to worry about the propagation of phase 
flip (Z) errors as well. (A Y error can be viewed as an X error and a Z error simultaneously afflicting the 
same qubit.) Second, the construction of a universal set of fault-tolerant gates is especially challenging in 
the quantum case; the off-line preparation and verification of quantum software is required to implement 
some of the gates [22]. And third, the code states for a quantum code have the property that the qubits in 
the code block are highly entangled with one another, so that the density operator of each individual qubit 
is maximally mixed. It can be a subtle matter to speak of an error at a particular position in the block, 
because the error might have no locally observable effect on the damaged qubit; the damage affects only the 
quantum correlations of that qubit with other qubits. 

In the proof of the threshold theorem we will study the performance of a recursive simulation. At "level 
0" of this recursion, the gadget "simulating" an ideal gate is just a noisy gate; we call it a 0-gate, or 0-Ga. At 
level 1, each ideal gate is simulated by a 1-rectangle, also called a 1-Rec, which acts on the code blocks (called 
1-blocks) of some quantum error-correcting code C. At level 2, an ideal gate is simulated by a 2-Rec, which 
acts on the code blocks (called 2-blocks) of the concatenated quantum code $C \circ C$. This 2-Rec is constructed
by replacing each 0-Ga in the 1-Rec by a 1-Rec. And so on: a $k$-Rec is constructed by replacing each 0-Ga
in the $(k-1)$-Rec by a 1-Rec.
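
The recursive construction implies that gadget size grows geometrically with the level. The sketch below illustrates this under a simplifying assumption that is ours, not the paper's: every 1-Rec is taken to contain the same number `locs_per_rec` of 0-Ga's (in practice different 1-Recs have different sizes, and the largest one gives an upper bound). The block size uses the fact that the code C has length n and one encoded qubit per block.

```python
def block_size(n, k):
    """Number of physical qubits in a k-block when C has length n."""
    return n ** k

def rec_locations(locs_per_rec, k):
    """Number of 0-Ga's in a k-Rec, assuming each 1-Rec has locs_per_rec of them."""
    count = 1  # level 0: a 0-Ga is a single location
    for _ in range(k):
        count *= locs_per_rec  # replace each 0-Ga by a 1-Rec
    return count

# Hypothetical numbers: a 7-qubit code and 1-Recs with 100 locations each.
for k in range(4):
    print(k, block_size(7, k), rec_locations(100, k))
```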

In our analysis of the threshold, the level-1 gadgets must have certain properties. We will state the properties
here, but we will postpone until Sec. 7 and 8 any detailed discussion of how gadgets with these properties
are constructed. Readers who are unfamiliar with the principles of fault-tolerant quantum computing may 
wish to jump ahead now to Sec. 7 and 8 to see explicit circuits for the gadgets. 

The desired properties are a bit different for codes of distance 5 and higher than for distance-3 codes. 
In order to proceed as briskly as possible to our proof of the threshold theorem for concatenated distance-3 
codes, we will focus here on the distance-3 case, and will return to higher-distance codes in Sec. 9 and 10. 

Our 0-Ga's include all of the quantum gates comprising a universal set; in addition there is a 0-preparation, 
which prepares a qubit in a standard state, and a 0-measurement, which destructively measures a qubit in a 
standard basis and records the outcome as a classical bit. It will be convenient to suppose that there are two
0-measurements, measurement in the Z-eigenstate basis and the X-eigenstate basis, and two 0-preparations,
the preparation of the Z eigenstate $|0\rangle$ with eigenvalue 1 and of the X eigenstate $|+\rangle$ with eigenvalue 1. For
each 0-Ga there is a corresponding 1-Rec, and in the level-1 simulation, each 0-Ga in the ideal circuit is
replaced by its corresponding 1-Rec. 

The 1-Recs are constructed using a level-1 error-correction gadget 1-EC, and level-1 gate gadgets, the
1-Ga's. If the 1-EC is executed with no faults, it corrects one error: it maps the input $E_a |\bar\psi\rangle$ to the output
$|\bar\psi\rangle$, where $|\bar\psi\rangle$ is a state in the code space, and $E_a$ is a Pauli operator with weight $w \le 1$. If the distance-3
code is not "perfect," there may be some error syndromes that indicate that more than one qubit has been 
damaged, and in that case successful error recovery might not be possible. When the 1-EC encounters an 
ambiguous syndrome, it maps its input to some state in the code space. For most of our discussion, it will 
not matter exactly how this codeword is chosen. 

For each unitary transformation U in our universal gate set, there is a level-1 gate gadget or 1-Ga; if 
the 1-Ga is executed with no faults, it applies U to the encoded state. The 1-Rec for the ideal gate U 
consists of the corresponding 1-Ga followed by a 1-EC acting on each of the output 1-blocks of the 1-Ga, as 
shown in Fig. 1. In addition, there is a 1-preparation that prepares a 1-block in the standard code state $|\bar 0\rangle$
(or in the conjugate state $|\bar +\rangle$), and a 1-measurement that destructively measures an encoded qubit in the
standard basis $\{|\bar 0\rangle, |\bar 1\rangle\}$ (or in the conjugate basis $\{|\bar +\rangle, |\bar -\rangle\}$), and records the outcome as a classical bit.
The preparation 1-Rec consists of 1-preparation followed by 1-EC, and the measurement 1-Rec is the same
thing as the 1-measurement.

Fig. 1. Level-1 simulation. Each 0-Ga in the ideal circuit is replaced by a 1-Rec, which consists of
the 1-Ga that simulates the 0-Ga, followed by a 1-EC acting on each output 1-block of the 1-Ga.

If $|\psi\rangle$ is an ideal state of a qubit, and $|\bar\psi\rangle$ is the corresponding encoded state of a 1-block, we say that
the actual state $\rho$ of the 1-block has at most one error if its purification can be expanded as eq. (1), where
the sum is restricted to Pauli operators with weight $w \le 1$. By following the principles of quantum fault
tolerance, we can construct a 1-EC and 1-Ga's with the following properties:

0. If a 1-EC contains no fault, it takes any input to an output in the code space.

1. If a 1-EC contains no fault, it takes an input with at most one error to an output with no errors.

2. If a 1-EC contains at most one fault, it takes an input with no errors to an output with at most one error.

3. If a 1-Ga contains no fault, it takes an input with at most one error to an output with at most one error in each output block.

4. If a 1-Ga contains at most one fault, it takes an input with no errors to an output with at most one error in each output block.

Property 1 is just the statement that 1-EC is an error recovery circuit for a code that can correct one error, 
while properties 2-4 express that the gadgets do not propagate errors badly. (In the case of property 3, the 
input is required to have at most one error all together acting in all input blocks; this error might propagate 
to other blocks, but it does not propagate to other qubits in the same block.) Property 0 holds if we adopt
a suitable convention for recovering when the error syndrome indicates more than one error. How gadgets 
satisfying properties 0-4 can be constructed will be discussed in more detail in Sec. 7 and 8. 

We can also construct a 1-preparation (which has no input) such that a 1-preparation with one fault 
produces an output with one error; this is a special case of property 4. And we can construct a 1-measurement 
(which has a classical bit as output) that agrees with an ideal measurement if either its input has one error 
and the 1-measurement has no faults, or its input has no errors and the 1-measurement has one fault; these 
can be viewed as special cases of properties 3 and 4. 

Actually, when we assert that a 1-measurement with one fault successfully measures a 1-block with no 
errors, we are implicitly assuming that either the outcome of the measurement is stored in the block of 
a classical error-correcting code, or if the outcome is decoded to a single bit, that the classical gates that 
decode the outcome are perfect. Otherwise, a single fault in the final decoding step could cause an error in 
the outcome. 

We would like to state a criterion that, if satisfied, ensures that the level-1 simulation of the ideal circuit 
is reliable. For this purpose it is convenient to group each 1-Rec together with the preceding 1-ECs that act 
on its input blocks; we call this composite object an extended rectangle or 1-exRec. Note that the extended
rectangles can overlap with one another, as in Fig. 2. Let us refer to the 1-ECs in an exRec that precede the
1-Ga as the leading error corrections in the exRec and the 1-ECs that follow the 1-Ga as the trailing error
corrections in the exRec. Then a trailing 1-EC of a 1-exRec is also a leading 1-EC of a 1-exRec that follows.

Fig. 2. Overlapping extended rectangles. Two consecutive 1-exRecs (indicated by dashed lines)
share a 1-EC, which is a trailing 1-EC of the earlier 1-exRec and a leading 1-EC of the later
1-exRec.

The 1-exRecs have an important property that follows from the properties 0-4 of the 1-gadgets:

Lemma 1. exRec-Cor at level 1. Suppose that the level-1 gadgets obey properties 0-4. Then if a 1-exRec
contains no more than one fault, and the input to its 1-Rec has no more than one error in each input block, 
its output has no more than one error in each output block. 

Let us say that a 1-exRec is good if it contains no more than one fault, and that a 1-Rec is correct if it takes 
an input with no more than one error per 1-block to an output with no more than one error per 1-block. 
Then the property exRec-Cor can be stated more succinctly as 

exRec-Cor at level 1. The 1-Rec contained in a good 1-exRec is correct. 

Proof of Lemma 1: First suppose that none of the leading 1-ECs of the 1-exRec contain any faults. Then 
the output of each leading 1-EC is a codeword by property 0. But if the output of the 1-EC (which is one of 
the input blocks to the 1-Rec) has at most one error and is also a codeword, then it has no errors. Therefore 
the input to the 1-Rec actually has no errors. Now, the 1-Rec might contain one fault, which could be in the 
1-Ga, or could be in one of the trailing 1-ECs. If the 1-Ga contains a fault, then by property 4 its output 
has no more than one error in each block, and by property 1 these errors will be corrected by the trailing 
1-ECs. If the 1-Ga contains no faults, then its output has no errors by property 3, and since each trailing 
1-EC has at most one fault, each output block from the 1-Rec has at most one error by property 2. 

On the other hand, suppose that one of the leading 1-ECs in the 1-exRec contains a fault. Then each of 
the other leading 1-ECs contains no faults, and outputs a codeword by property 0. Therefore, if each input 
block to the 1-Rec has at most one error, then in fact all but one of the input blocks have no errors. Now 
there are no faults contained in the 1-Ga or in any of the trailing 1-ECs. Therefore each output block of 
the 1-Ga has no more than one error by property 3, and the output of each trailing 1-EC has no errors by 
property 1. 

These arguments also apply to the 1-preparation exRec (which is the same as the 1-preparation Rec), 
and to the 1-measurement exRec (for which correctness means that the measurement reproduces a perfect 
measurement of an ideal 1-block). 

□ 
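
The case analysis in this proof can be checked mechanically in a simplified setting. The sketch below is an abstraction of ours, not a simulation of any actual gadget: it considers a 1-exRec with a single input block and a single output block, tracks only a worst-case error count per block (0, 1, or `MANY` for "more than one, no guarantee"), and treats properties 0-4 as the only available guarantees. As in the proof, a fault-free 1-Ga acting on an error-free input is taken to produce an error-free output.

```python
from itertools import product

MANY = 2  # stands for "more than one error": no guarantee is available

def ec(errors_in, faults):
    """Worst-case number of errors leaving a 1-EC."""
    if faults == 0 and errors_in <= 1:
        return 0          # property 1
    if faults <= 1 and errors_in == 0:
        return 1          # property 2
    return MANY

def ga(errors_in, faults):
    """Worst-case number of errors leaving a 1-Ga."""
    if faults == 0 and errors_in <= 1:
        return errors_in  # property 3 (error-free input stays error-free)
    if faults <= 1 and errors_in == 0:
        return 1          # property 4
    return MANY

def exrec_output(f_lead, f_ga, f_trail):
    """Worst-case errors at the output of a single-block 1-exRec.

    Hypothesis of Lemma 1: the input to the 1-Rec has at most one error.
    If the leading 1-EC is fault-free, its output is a codeword (property 0),
    so "at most one error" sharpens to "no errors".
    """
    rec_input = 0 if f_lead == 0 else 1
    return ec(ga(rec_input, f_ga), f_trail)

# Good exRecs: at most one fault in total among leading EC, Ga, trailing EC.
good = [f for f in product(range(3), repeat=3) if sum(f) <= 1]
assert all(exrec_output(*f) <= 1 for f in good)
print("every good single-block 1-exRec is correct in this model")
```

In the same model, every placement of two faults (for example, one in the leading 1-EC and one in the 1-Ga) evaluates to `MANY`, which reflects only that properties 0-4 give no guarantee in that case, not that the gadget necessarily fails.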

If all 1-exRecs are good, then our level-1 simulation will be successful (it produces exactly the same 
probability distribution for the final readout as the ideal circuit). We simply observe that the initial
1-preparations produce input blocks with at most one error, and that for every 1-Rec that follows, each input
block has at most one error so that each output block also has at most one error by exRec-Cor. Finally, the
1-measurements at the end of the circuit simulate the ideal measurements of the output blocks faithfully.
The simulation works because the goodness of the exRecs ensures that each error caused by a fault gets 
corrected before it can be joined by a second error in the same block that would cause the simulation to fail. 

Suppose that the ideal circuit contains L locations, and suppose that stochastic faults occur independently,
with probability $\varepsilon$, at each location within the noisy quantum circuit used in our level-1 simulation. (A
"location" can be a 0-preparation, a 0-measurement, or a gate, including an identity gate acting on a
"resting" qubit.) For the simulation to fail, there must be at least two faults in at least one 1-exRec. For
each specified pair of locations inside a 1-exRec, faults occur at both of those locations with a probability
no larger than $\varepsilon^2$. Therefore, the probability of failure for the level-1 simulation can be bounded as

$$P_{\rm fail} \le L A \varepsilon^2 , \qquad (3)$$

where A is the number of pairs of locations in the largest 1-exRec. Of course, in this estimate we are 
being overly pessimistic, because not all pairs of fault locations in the 1-exRec will cause the 1-exRec to be 
incorrect. We will return to this point in Sec. 6. 

Thus, for independent stochastic errors, using quantum error correction and fault-tolerant gadgets reduces
the probability of failure per gate from $\varepsilon$ to $O(\varepsilon^2)$; quantum coding improves the reliability of the quantum
computation if the fault rate $\varepsilon$ is small enough.
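
To see how pessimistic the pair-counting estimate behind eq. (3) is even before malignant pairs are distinguished from benign ones, one can compare it with the exact probability that an exRec contains two or more faults. The sketch below assumes independent faults at rate `eps` and a hypothetical exRec with `n` locations, so that A = C(n, 2); the numbers are placeholders, not counts taken from any gadget in this paper.

```python
from math import comb

def exact_two_or_more(n, eps):
    """Probability that at least two of n independent locations are faulty."""
    p_none = (1.0 - eps) ** n
    p_one = n * eps * (1.0 - eps) ** (n - 1)
    return 1.0 - p_none - p_one

def pair_bound(n, eps):
    """Union bound A * eps**2 with A = C(n, 2) pairs of locations."""
    return comb(n, 2) * eps ** 2

def simulation_failure_bound(L, A, eps):
    """Right-hand side of eq. (3), capped at 1 since it bounds a probability."""
    return min(1.0, L * A * eps ** 2)

n = 50  # hypothetical number of locations in the largest 1-exRec
for eps in (1e-3, 1e-2):
    print(eps, exact_two_or_more(n, eps), pair_bound(n, eps))
print(simulation_failure_bound(L=1000, A=comb(n, 2), eps=1e-5))
```

The union bound always dominates the exact probability, and the two agree to leading order in `eps` when `n * eps` is small.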

3 Recursive simulation: goodness and correctness 

Fault-tolerant simulation at level 1 achieves a modest improvement of the failure probability per gate, from
$\varepsilon$ to $O(\varepsilon^2)$. Further improvement to a higher power of $\varepsilon$ can be attained by using codes that correct more
errors. But as the codes get more complex, so do the rectangles, and for many families of codes one finds that
fault-tolerant protocols are effective only for smaller and smaller values of $\varepsilon$ as the code's distance grows.

If our goal is to compute reliably for as large a value of the fault rate as possible, the best known strategy 
is to use a recursive simulation, as in Fig. 3. In this scheme, a fault-tolerant gadget at level k is constructed 
by replacing each level-0 location in the level-$(k-1)$ gadget by the corresponding level-1 rectangle. (This
includes the identity gate; that is, a "resting" qubit that is not acted on by any gate in a particular time
step is simulated by a 1-EC, which is the 1-Rec for the identity.) Equivalently, we may say that the level-$k$
gadget is constructed by replacing each 0-Ga in the level-1 gadget by the corresponding $(k-1)$-Rec. One
then hopes to achieve an arbitrarily reliable simulation by observing that, once $\varepsilon$ is small enough, the failure
probability per gate declines each time the level of the simulation increases by one. It is almost obvious
that this idea is sound, but it will be important to choose our definitions carefully to ensure that the proof
can proceed smoothly. To prove the threshold theorem, we wish to extend our observation that "a good
rectangle is correct" to higher levels than k = 1. That is, we are to argue that if the faults in a $k$-Rec are
sufficiently sparse (the $k$-Rec is good), then it takes accurately encoded inputs to accurately encoded outputs
(the $k$-Rec is correct). The key is to find a suitable definition of "good" and "correct" so that we can easily
establish that "a good rectangle is correct" by an inductive argument.
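
The intuition that the failure probability per gate declines with each level can be illustrated by iterating the level-1 estimate. The sketch below is heuristic only: it assumes that a level-k gadget behaves like a level-(k-1) gate whose effective fault rate is A times the square of the previous rate, which is exactly the kind of statement that requires the careful definitions discussed here. The value of `A` is an arbitrary placeholder, and the function name `effective_rates` is ours.

```python
def effective_rates(eps, A, levels):
    """Iterate rate -> A * rate**2, starting from the physical fault rate eps."""
    rates = [eps]
    for _ in range(levels):
        # Cap at 1 so that the value remains a probability.
        rates.append(min(1.0, A * rates[-1] ** 2))
    return rates

A = 1e4              # placeholder number of fault pairs per extended rectangle
break_even = 1.0 / A  # A * eps**2 < eps exactly when eps < 1/A

below = effective_rates(0.5 * break_even, A, levels=4)
above = effective_rates(2.0 * break_even, A, levels=4)
print("below break-even:", below)
print("above break-even:", above)
```

Below the break-even rate the sequence falls doubly exponentially in the level, while above it the sequence grows until it saturates; this is the qualitative behavior that a threshold theorem makes rigorous.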

The strategy behind our proofs of the threshold theorem, based on appropriate notions of goodness and 
correctness, is inspired by the work of Aharonov and Ben-Or [2]. Similar notions were applied earlier in the 
theory of fault-tolerant classical computation, for example by Gács [23].

3.1 Goodness

The obvious way to generalize the concept of goodness to higher levels would be to say that a $(k+1)$-exRec
is good if it contains no more than one bad $k$-exRec; however, that choice would lead to problems, because
consecutive exRecs overlap with one another. When we estimate the probability that a $(k+1)$-exRec is bad,
we wish to regard two bad $k$-exRecs contained in the $(k+1)$-exRec as independent events, but consecutive
bad $k$-exRecs might not be independent. For example, the $k$-EC shared by two consecutive $k$-exRecs might
contain two bad $(k-1)$-exRecs, causing both consecutive $k$-exRecs to be bad. Or one bad $(k-1)$-exRec in
the shared $k$-EC might combine with one bad $(k-1)$-exRec in the $k$-Rec that follows the $k$-EC and with one
bad $(k-1)$-exRec in the $k$-Ga or $k$-EC that precedes it to cause both consecutive $k$-exRecs to be bad.

Fig. 3. A recursive simulation. A level-$k$ gadget is built from level-$(k-1)$ gadgets, which are built
from level-$(k-2)$ gadgets, and so on.

Fig. 4. Two nonindependent consecutive bad 1-exRecs, indicated by dashed lines, with fault
locations indicated by ×. Because one of the three faults is contained in the shared 1-EC, the
bad 1-exRecs are not independent events. In this situation, we may regard the earlier 1-Ga as a
gadget that simulates the corresponding ideal 0-Ga accurately, and the later 1-exRec as a gadget
that simulates the corresponding ideal 0-Ga inaccurately.

Fortunately, there is a simple solution to this predicament. We will say that a (k+1)-exRec is bad only if
it contains two bad k-exRecs that actually do fail independently of one another. Thus a good (k+1)-exRec is
permitted to contain two overlapping bad k-exRecs, but only if the earlier of the two consecutive k-exRecs
would have been good were it not for the bad (k-1)-exRecs contained in the k-EC it shares with the following
k-exRec. Nonindependent consecutive pairs of k-exRecs are acceptable, because we will be able to argue
that they are really no more harmful than a single bad k-exRec, at the later of the two consecutive locations.

Suppose, for example, that two consecutive 1-exRecs are both bad, due to a total of three faults: one
in the shared 1-EC, one in the following 1-Rec, and one in the preceding 1-ECGa (we use 1-ECGa to denote
leading 1-ECs followed by a 1-Ga), as illustrated in Fig. 4. These two overlapping 1-exRecs contain two
1-Recs, which simulate a pair of 0-Ga's in the ideal circuit. Ordinarily we would think of the earlier of the
two 1-Recs as the gadget that simulates the earlier of the two 0-Ga's. But in this case we may instead
regard the 1-Ga contained in the earlier 1-Rec as the gadget that simulates the earlier 0-Ga, and because
the 1-ECGa contains only one fault, the 1-Ga simulates the earlier 0-Ga accurately. We may also regard
the later 1-exRec (rather than the 1-Rec it contains) as the gadget that simulates (inaccurately) the later
0-Ga. In this sense the overlapping pair of bad 1-exRecs is really no worse than a single bad 1-exRec. Our
inductive proof (see Sec. 5.2) will show that this idea also works at higher levels.

We note that the opening gates in a gadget (those acting in the first time step) are simulated by Recs
whose exRecs are not, strictly speaking, "contained in" the gadget (the leading EC of the exRec is actually
part of the preceding gadget). Nevertheless, in a slight abuse of language, we will say that an opening
(k-1)-exRec in a k-gadget is contained in the k-gadget.

When we estimate the probability that a k-exRec is bad, we wish to regard the bad (k-1)-exRecs
contained in the k-exRec as independent events. Therefore, we define badness as follows:

Definition. Goodness and Badness. A 1-exRec is bad if it contains two faults; if it is not bad it is good. 
Two bad 1-exRecs are independent if they are nonoverlapping or if they overlap and the earlier 1-exRec is 
still bad when the shared 1-EC is removed. For k > 1, a k-exRec is bad if it contains two independent bad 
(k-1)-exRecs; if it is not bad it is good. Two bad k-exRecs are independent if they are nonoverlapping or if
they overlap and the earlier k-exRec is still bad when the shared k-EC is removed. 

To ensure that the earlier (truncated) k-exRec and the later (complete) k-exRec are really independent, it
is important that no 0-Ga is contained in both. Thus when we say that "the shared k-EC is removed" from
the earlier k-exRec, we mean that the complete (k-1)-exRecs in the first time step of the later k-exRec
are excluded from the earlier k-exRec. Similarly, the complete (k-2)-exRecs in the first time step of each
of these (k-1)-exRecs are excluded from the earlier k-exRec, and so on. This point will be elaborated in
Sec. 5.2.1.
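The level-1 part of the definition can be illustrated with a minimal sketch. It models two consecutive 1-exRecs only by fault counts in three disjoint regions: the preceding 1-ECGa, the shared 1-EC, and the following 1-Rec. The class and function names are hypothetical, and the sketch assumes that "contains two faults" means "at least two faults"; it is an illustration of the bookkeeping, not part of the proof.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverlappingPair:
    """Fault counts for two consecutive 1-exRecs that share a 1-EC (hypothetical model)."""
    ecga_faults: int       # faults in the preceding 1-ECGa (earlier exRec, outside the shared 1-EC)
    shared_ec_faults: int  # faults in the 1-EC shared by both 1-exRecs
    rec_faults: int        # faults in the following 1-Rec (later exRec, outside the shared 1-EC)


def is_bad(num_faults: int) -> bool:
    # Assumption: "contains two faults" is read as "at least two faults".
    return num_faults >= 2


def classify(pair: OverlappingPair) -> tuple[bool, bool, bool]:
    """Return (earlier_bad, later_bad, independent) for the overlapping pair."""
    earlier_bad = is_bad(pair.ecga_faults + pair.shared_ec_faults)
    later_bad = is_bad(pair.shared_ec_faults + pair.rec_faults)
    # Independence: the earlier exRec must still be bad once the shared 1-EC is removed.
    still_bad_when_truncated = is_bad(pair.ecga_faults)
    independent = earlier_bad and later_bad and still_bad_when_truncated
    return earlier_bad, later_bad, independent
```

For the configuration of Fig. 4 (one fault in each region), `classify(OverlappingPair(1, 1, 1))` reports that both 1-exRecs are bad but not independent, so the pair counts as a single bad 1-exRec at the later location.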

With this definition of badness, bad k-exRecs become very unlikely as k increases, if the fault rate is
sufficiently small. For independent stochastic faults occurring with probability $\varepsilon$, as in eq. (3), the probability
$\varepsilon^{(1)}$ that a 1-exRec is bad satisfies

$$\varepsilon^{(1)} \le A \varepsilon^2 , \tag{4}$$

where A is the number of pairs of locations in the largest 1-exRec. Because of the self-similarity of the
k-exRecs, and because independent bad (k-1)-exRecs are independent events, the probability $\varepsilon^{(k)}$ that a
k-exRec is bad satisfies

$$\varepsilon^{(k)} \le A \left(\varepsilon^{(k-1)}\right)^2 , \tag{5}$$

which together with eq. (4) implies

Lemma 2. Bad exRecs are rare. Suppose that stochastic faults occur independently, with probability $\varepsilon$,
at each circuit location in a k-exRec. Then the probability that the k-exRec is bad satisfies

$$\varepsilon^{(k)} \le \varepsilon_0 \left(\varepsilon/\varepsilon_0\right)^{2^k} , \tag{6}$$

where $\varepsilon_0^{-1} = A$ is the number of pairs of locations in the largest 1-exRec.

This $\varepsilon_0$ is a lower bound on the accuracy threshold (and in Sec. 6 we will see that this estimate can be
improved); for $\varepsilon < \varepsilon_0$, the probability of badness declines double-exponentially with k.
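The double-exponential decline can be checked numerically. The sketch below iterates the recursion of eqs. (4) and (5) and compares it with the closed form of eq. (6); the function names and the value A = 1000 used in the usage note are hypothetical choices for illustration, not numbers taken from the analysis of any particular code.

```python
def iterated_bad_bound(eps: float, num_pairs: int, level: int) -> float:
    """Iterate eqs. (4)-(5): start from eps and apply x -> A * x**2 once per level."""
    bound = eps
    for _ in range(level):
        bound = num_pairs * bound ** 2
    return bound


def closed_form_bad_bound(eps: float, num_pairs: int, level: int) -> float:
    """Closed form of eq. (6): eps0 * (eps / eps0) ** (2 ** level), with eps0 = 1 / A."""
    eps0 = 1.0 / num_pairs
    return eps0 * (eps / eps0) ** (2 ** level)
```

With the hypothetical values A = 1000 (so that the threshold estimate is 1e-3) and a fault rate of 1e-4, the two functions agree at every level, and the bound falls from 1e-5 at level 1 to roughly 1e-19 at level 4.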

3.2 Correctness 

How should we define correctness? First we need a notion of what it means for a state to be "accurately
encoded" at level k. Since in our recursive simulation we will read out the result of the computation using
a recursive measurement procedure, an accurate encoding should be one that can be successfully decoded
recursively. For the purpose of defining our notion of correctness for noisy circuits, we will employ an ideal
decoder at level k (a k-decoder), a conceptual device that maps a level-k encoded block to a single qubit
[24]. The level-1 ideal decoder measures and records the error syndrome, performs recovery as indicated by
the syndrome, then decodes the 1-block to a qubit, and finally discards the syndrome. We use the word
"ideal" to emphasize that this procedure is carried out flawlessly; there are no faults. The level-k decoder
is defined recursively; it is realized by first applying the (k-1)-decoder to each of the (k-1)-subblocks, and
then applying the 1-decoder to the resulting 1-block.
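The recursive structure of the ideal decoder can be sketched with a purely classical toy analogy. The example below substitutes the 3-bit repetition code with majority voting for the quantum code, which is an assumption made only for illustration: it shows how a level-k decoder is assembled from (k-1)-decoders followed by a 1-decoder, and says nothing about quantum syndromes or recovery. The function names are hypothetical.

```python
from typing import Sequence


def decode_level1(block: Sequence[int]) -> int:
    """Toy 1-decoder: majority vote over a 3-bit repetition block."""
    if len(block) != 3:
        raise ValueError("a toy 1-block has exactly 3 bits")
    return 1 if sum(block) >= 2 else 0


def decode(block: Sequence[int], level: int) -> int:
    """Toy k-decoder: decode each (k-1)-subblock, then apply the 1-decoder."""
    if len(block) != 3 ** level:
        raise ValueError("a toy k-block has exactly 3**k bits")
    if level == 0:
        return block[0]  # a 0-block is a single unencoded bit
    sub = len(block) // 3
    decoded_subblocks = [
        decode(block[i * sub:(i + 1) * sub], level - 1) for i in range(3)
    ]
    return decode_level1(decoded_subblocks)
```

For instance, the 2-block `[1, 1, 0, 0, 1, 1, 0, 0, 0]` decodes to 1: its three 1-subblocks decode to 1, 1 and 0, and the final majority vote returns 1.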

When we say that a k-Rec is correct, we mean that it simulates the corresponding ideal gate accurately.

Definition. Correctness. A k-Rec is correct if the k-Rec followed by the ideal k-decoder is equivalent to
the ideal k-decoder followed by the ideal 0-Ga that the k-Rec simulates:

    [correct k-Rec] -- [ideal k-decoder]  =  [ideal k-decoder] -- [ideal 0-Ga]

To be specific, suppose that the ideal 0-Ga applies the unitary transformation U to a qubit or to several
qubits. Suppose that the input to the k-Rec is a state $\rho$ such that the ideal decoder maps $\rho$ to the "ideal"
pure state $|\psi\rangle$. According to our criterion, then, if the k-Rec is correct, the ideal decoder maps the output
of the k-Rec to $U|\psi\rangle$. In this sense, our notion of correctness captures the idea that the k-Rec maintains the
decodability of states.

There is a similar notion of correctness that applies to the k-preparation and the k-measurement. A
preparation k-Rec (k-preparation followed by k-EC) is correct if the ideal k-decoder maps its output to the
ideally prepared state, e.g., $|0\rangle$ or $|+\rangle$:

    [correct k-prep.] -- [ideal k-decoder]  =  [ideal 0-prep.]

A k-measurement is correct if it realizes the same POVM as the ideal k-decoder followed by the ideal
0-measurement:

    [correct k-meas.]  =  [ideal k-decoder] -- [ideal 0-meas.]


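The crucial property of k-rectangles used in the proof of the threshold theorem is:
exRec-Cor. The k-Rec contained in a good k-exRec is correct.
In other words, good k-exRecs satisfy the following identities:

    [k-EC] -- [k-Rec] -- [ideal k-decoder]  =  [k-EC] -- [ideal k-decoder] -- [ideal 0-Ga]

[Diagrams: the identity above for a gate, together with the analogous identities for a good preparation k-exRec and a good measurement k-exRec.]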

We note that, for k = 1, this formulation of exRec-Cor is actually somewhat stronger than the formulation 
used in Sec. 2 — here the input to the exRec is unrestricted, while the criterion for correctness used in Sec. 2 
stipulates an input to the Rec that has at most one error. 
In Sec. 5 we will prove the important 

Lemma 3. Good implies correct. Suppose that all types of level-1 rectangles satisfy property exRec-Cor.
Then exRec-Cor holds for all types of rectangles at each level k > 1. 

It follows from Lemma 3 that our level-k fault-tolerant simulation of an ideal quantum circuit will succeed
if every k-exRec is good. To reach this conclusion, we first use the correctness of the final k-measurements
to replace each k-measurement by an ideal k-decoder followed by an ideal 0-measurement. Then we use
the correctness of the k-Recs that simulate the quantum gates to move the ideal k-decoders to the left
through the circuit, transforming each k-Rec to an ideal 0-Ga. Finally, using the correctness of the initial
k-preparations, we replace each k-preparation by preparation of a qubit in the ideally prepared state. These
steps show that the ideal circuit and its level-k simulation are equivalent: both produce precisely the same
probability distribution of outcomes for the final measurement. Thus we have

Lemma 4. Good circuits match ideal answers. Suppose that all level-1 rectangles satisfy property 
exRec-Cor. Then if every exRec is good in a level-k simulation of an ideal circuit, the probability distribution 
for the final measurement outcomes in the simulation is exactly the same as the probability distribution for 
the final measurement outcomes in the ideal circuit. 

It is implicit in the formulation of these lemmas that we assume classical computation is reliable, or that 
the final outcome of the quantum computation is protected robustly in a classical code block rather than 
decoded to a single bit. If the classical gates are noisy and the outcome is decoded, then a single fault in 
the final decoding step could cause a 1-measurement to fail. Therefore 1-measurement would not satisfy 
exRec-Cor, and the lemmas would not apply. 

The notion of correctness that we have defined using the ideal decoder corresponds fairly closely to the 
notion of correctness that we used in our discussion of level-1 simulations, but there are also some important 
differences that should be noted. For one, our new formulation of correctness has been stated as a property 
of operations, without any explicit reference to states. This feature is advantageous since k-blocks are
typically highly entangled with other k-blocks, and the goal of a fault-tolerant simulation is to maintain this
entanglement; whether this goal is achieved cannot be judged by examining only the local action of each
k-Rec on its input blocks.

Also, to see that the new version of exRec-Cor is true at level 1, we need to reformulate the properties 
0-4 of the level-1 gadgets listed in Sec. 2. We will postpone discussing the details of this reformulation 
until Sec. 9. But one point deserves emphasis now: an additional property also must be assumed to enforce 
exRec-Cor. Let us say that the state of a 1-block is valid if the purification of the state can be expanded as 
in eq. (1), where each Pauli operator $E_a$ has weight $w \le 1$ and acts on a codeword. The additional necessary
property is: 

0'. If a 1-EC contains one fault, it takes any input to a valid output. 

The property 0' is needed to ensure that, if one of the leading 1-ECs of the 1-exRec has a fault, the input
to the 1-Rec "has at most one error" all together in all input blocks; that is, to ensure that the simulated
gate, followed by ideal decoding, agrees with ideal decoding followed by an ideal gate. If our distance-3 code
is "perfect" (if every possible error syndrome points to at most one error), then the property 0' is automatic.
But by following carefully the principles of quantum fault tolerance, we can build a 1-EC that satisfies 0' 
for any distance-3 code; we will return to this point in Sec. 7. In any case, for our proof of the threshold 
theorem, it will not be necessary to explain how gadgets that satisfy exRec-Cor are constructed at level 1; 
it will suffice just to know that such gadgets exist. 

We have stressed the applicability of our criterion for correctness to distance-3 codes, but we should also 
point out that it can be generalized to distance-(2t+1) codes that correct t errors. In that case, we can say
that a 1-exRec is bad only if it contains faults in at least t + 1 locations. Then it is possible to construct
fault-tolerant gadgets such that exRec-Cor holds at level 1, and by an inductive proof to establish exRec-Cor 
at level k as well. This generalization to higher-distance codes will be discussed in Sec. 9. 

4 The quantum threshold theorem 

The crux of the quantum threshold theorem is the proof of Lemma 3 asserting the property exRec-Cor
("good implies correct") for k-exRecs. We will postpone this proof until Sec. 5. First, we will explain how
Lemma 3 is used to complete the proof of our main theorem.

How accurate is a level-k fault-tolerant simulation? At the conclusion of the computation, some qubits
are measured. Let $\{p_i^{(\mathrm{ideal})}\}$ denote the probability distribution for the measurement outcomes for the ideal
circuit, and let $\{p_i^{(\mathrm{actual})}\}$ denote the probability distribution for the measurement outcomes for the actual
noisy circuit. We define the error $\delta$ of the noisy computation as the $L^1$ distance between these two probability
distributions:

$$\delta = \left\| p^{(\mathrm{actual})} - p^{(\mathrm{ideal})} \right\|_1 = \sum_i \left| p_i^{(\mathrm{actual})} - p_i^{(\mathrm{ideal})} \right| . \tag{7}$$

Now, we know that if every k-exRec in the simulation is good, then $p^{(\mathrm{actual})} = p^{(\mathrm{ideal})}$, and furthermore bad
k-exRecs are rare if the fault rate is low and k is large. Let us say that the k-simulation fails if any k-exRec
is bad. Then if the ideal circuit has L locations, and stochastic faults occur independently with probability
$\varepsilon < \varepsilon_0$ at each location in the noisy circuit, the probability of failure can be bounded as

$$P_{\mathrm{fail}}^{(k)} \le L \varepsilon^{(k)} \le \varepsilon_0 L \left(\varepsilon/\varepsilon_0\right)^{2^k} , \tag{8}$$

using eq. (6).

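By averaging over the fault locations in the level-k fault-tolerant circuit, we find

$$p_i^{(\mathrm{actual})} = \left(1 - P_{\mathrm{fail}}^{(k)}\right) p_i^{(\mathrm{ideal})} + P_{\mathrm{fail}}^{(k)} p_i^{(\mathrm{fail})} , \tag{9}$$

for some distribution $\{p_i^{(\mathrm{fail})}\}$. Therefore, we find an upper bound on the error

$$\delta = P_{\mathrm{fail}}^{(k)} \sum_i \left| p_i^{(\mathrm{fail})} - p_i^{(\mathrm{ideal})} \right| \le 2 P_{\mathrm{fail}}^{(k)} \le 2 \varepsilon_0 L \left(\varepsilon/\varepsilon_0\right)^{2^k} \tag{10}$$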

(since the maximal $L^1$ distance between any two probability distributions is 2). Rearranging, we see that
error $\delta$ or better can be achieved by choosing

$$2^k \ge \frac{\log\left(2 \varepsilon_0 L / \delta\right)}{\log\left(\varepsilon_0/\varepsilon\right)} . \tag{11}$$
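For readers who want the intermediate algebra, the steps from eq. (9) to eq. (11) can be spelled out as follows. This is a worked restatement of the argument in the text, under the standing assumption $\varepsilon < \varepsilon_0$, which makes $\log(\varepsilon_0/\varepsilon)$ positive so that the final division preserves the inequality.

```latex
\begin{align*}
p_i^{(\mathrm{actual})} - p_i^{(\mathrm{ideal})}
  &= P_{\mathrm{fail}}^{(k)} \left( p_i^{(\mathrm{fail})} - p_i^{(\mathrm{ideal})} \right)
  && \text{(subtract $p_i^{(\mathrm{ideal})}$ from eq.~(9))} \\
\delta
  &= P_{\mathrm{fail}}^{(k)} \sum_i \left| p_i^{(\mathrm{fail})} - p_i^{(\mathrm{ideal})} \right|
   \le 2 P_{\mathrm{fail}}^{(k)}
   \le 2 \varepsilon_0 L \left( \varepsilon / \varepsilon_0 \right)^{2^k}
  && \text{(eq.~(8))} \\
2 \varepsilon_0 L \left( \varepsilon / \varepsilon_0 \right)^{2^k} \le \delta
  &\iff \left( \varepsilon_0 / \varepsilon \right)^{2^k} \ge \frac{2 \varepsilon_0 L}{\delta}
   \iff 2^k \ge \frac{\log\left( 2 \varepsilon_0 L / \delta \right)}{\log\left( \varepsilon_0 / \varepsilon \right)}
\end{align*}
```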

Suppose that the largest size (number of locations) of any 1-Rec is $\ell$ and that the largest depth (number
of time steps) of any 1-Rec is d. Then because of the self-similarity of the k-Recs, no k-Rec can have size
larger than $\ell^k = (2^k)^{\log_2 \ell}$ and no k-Rec can have depth larger than $d^k = (2^k)^{\log_2 d}$. From eq. (11) we
therefore obtain

Theorem 1. Quantum accuracy threshold for independent stochastic noise. Suppose that fault-tolerant
gadgets can be constructed such that all 1-exRecs obey the property exRec-Cor, and such that $\ell$ is the
maximal number of locations in a 1-Rec, d is the maximal depth of a 1-Rec, and $\varepsilon_0^{-1}$ is the maximal number
of pairs of locations in a 1-exRec. Suppose that independent stochastic faults occur with probability $\varepsilon < \varepsilon_0$ at
each location in a noisy quantum circuit. Then for any fixed $\delta$, any ideal circuit with L locations and depth
D can be simulated with error $\delta$ or better by a noisy circuit with L* locations and depth D*, where

$$L^* = O\left(L (\log L)^{\log_2 \ell}\right) , \qquad D^* = O\left(D (\log L)^{\log_2 d}\right) . \tag{12}$$
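To make the scaling concrete, the following sketch computes the smallest concatenation level k that satisfies eq. (11), together with the per-location size and depth bounds $\ell^k$ and $d^k$. The function names and all numerical values in the usage note are hypothetical and chosen only for illustration; they are not estimates for any particular code or gadget construction.

```python
import math


def required_level(eps: float, eps0: float, num_locations: int, delta: float) -> int:
    """Smallest integer k >= 0 with 2**k >= log(2*eps0*L/delta) / log(eps0/eps), as in eq. (11)."""
    if not 0.0 < eps < eps0:
        raise ValueError("eq. (11) requires 0 < eps < eps0")
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    target = math.log(2.0 * eps0 * num_locations / delta) / math.log(eps0 / eps)
    level = 0
    while 2 ** level < target:
        level += 1
    return level


def rec_overhead(level: int, max_size: int, max_depth: int) -> tuple[int, int]:
    """Upper bounds (max_size**k, max_depth**k) on the size and depth of a k-Rec."""
    return max_size ** level, max_depth ** level
```

For example, with the hypothetical values eps0 = 1e-3, eps = 1e-4, L = 10**6 and delta = 1e-2, the right-hand side of eq. (11) is about 5.3, so k = 3 suffices while k = 2 does not. If each 1-Rec had at most 100 locations and depth 10 (again hypothetical), each ideal location would be simulated by at most 10**6 locations with depth at most 1000.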



Theorem 1 is our main result. What is new is the connection between the accuracy threshold and properties 
of extended rectangles, which are satisfied by suitably designed level-1 gadgets for distance-3 codes. 

An implicit assumption in the statement of the theorem is that either classical computation is reliable or
else the final output of the computation is protected in a classical code block. If classical gates are noisy, and
we wish to decode the final outcome to a single bit, then a single fault in the final decoding step could cause
an error in the output. For this reason, it would not be possible to construct a 1-measurement that obeys
exRec-Cor, and the theorem would not apply. There is still a quantum accuracy threshold if the classical
computation is noisy (or if all gates are noisy quantum gates), but the error $\delta$ cannot be arbitrarily small
in that case; rather it is limited to $\delta = O(\varepsilon)$. We are also taking it for granted that all input qubits are
prepared in standard known states such as $|0\rangle$. Only then can we construct a 1-preparation with property
exRec-Cor. If the input includes qubits in unknown states, a single fault in the first encoding step could
cause an error in the computation.

As we will discuss in Sec. 9, Theorem 1 can be generalized to codes of higher distance (for which the
threshold estimate $\varepsilon_0$ may be lower but the overhead may scale more favorably). For distance-3 codes, our
estimate of the accuracy threshold can be further improved; see Sec. 6.

5 Good implies correct 

5.1 The threshold dance 

To complete the proof of the threshold theorem, it only remains to prove Lemma 3: "good implies correct." 
The proof will be by induction on the level k of the simulation. We assume that level-1 gadgets have been
constructed so that the property exRec-Cor is satisfied. Then we must show that if exRec-Cor holds at level
k it also holds at level k + 1.

The idea underlying the inductive step is quite clear. We may regard each (k+1)-gadget as a simulation
of the corresponding 1-gadget, where each gate is replaced by a k-Rec. Then we invoke the induction
hypothesis at level k to justify that this circuit composed of k-Recs simulates the 1-gadget accurately, and
invoke property exRec-Cor at level 1 to complete the induction step. We have chosen our definitions and
properties so that this idea can be realized relatively easily.

Specifically, our proof exploits the recursive construction of the ideal decoder: decoding of a (k+1)-block
is achieved by first decoding each k-subblock to a qubit, and then decoding the resulting 1-block. To show
that a (k+1)-Rec is correct, we view the (k+1)-decoder as a k-decoder acting on each k-subblock, followed
by the 1-decoder. Repeatedly using the property exRec-Cor at level k, we steadily move the k-decoders
to the left, one k-Rec at a time, until they reach the front of the (k+1)-exRec; thereby we transform the
(k+1)-exRec to a 1-exRec. If the original (k+1)-exRec is good, so is the resulting 1-exRec (see Sec. 5.2);
therefore using exRec-Cor at level 1, we can move the 1-decoder to the front of the 1-Rec, transforming it
to the corresponding ideal 0-Ga. Finally, we can sweep the k-decoders back to the right to the front of the
(k+1)-Rec, replacing the leading 1-EC of the 1-exRec by the original (k+1)-EC, and reuniting the k-decoders
with the 1-decoder to reassemble the (k+1)-decoder. This completes the demonstration of exRec-Cor at level
k+1.

We affectionately refer to this maneuver, in which the k-decoders first sweep forward to the front of
the (k+1)-exRec, the 1-decoder follows to the front of the resulting 1-Rec, and then the k-decoders sweep
backward to rejoin the 1-decoder, as the threshold dance. See Fig. 5.

5.2 Bad rectangles as simulated faults 

The idea behind the argument sketched above is that, by moving the k-decoders to the left through a good
(k+1)-exRec, we transform it to a good 1-exRec. It is clear that the property exRec-Cor at level k allows
us to move a k-decoder through the k-Rec contained in a good k-exRec, transforming the k-Rec to an ideal
0-Ga. But a good (k+1)-exRec might also contain a bad k-exRec, or a consecutive pair of nonindependent
bad k-exRecs. Intuitively, we should be able to regard each k-Rec contained in a bad k-exRec as a simulation
of a faulty 0-Ga contained in the 1-exRec, and as suggested in Sec. 3.1, we expect that a consecutive pair of
nonindependent bad k-exRecs can be regarded as a simulation of a pair of 0-Ga's such that only the later of
the two is faulty. Then the 1-exRec becomes a circuit with a single fault, and is good.

To complete the argument, then, we need to explain what happens as the k-decoder sweeps past a bad
k-exRec or a nonindependent pair of consecutive k-exRecs. Let us first discuss how we treat the case of two
nonindependent bad k-exRecs.

Fig. 5. The threshold dance, shown schematically here, is the pivotal maneuver in the inductive
proof that a good exRec is correct.

5.2.1 Nonindependent pairs of bad k-exRecs 

By definition, if two bad k-exRecs comprise a nonindependent consecutive pair, the earlier of the two consecutive
k-exRecs is good when the shared k-EC is removed. Therefore, when the k-decoders, as they migrate
to the left, reach the consecutive pair of bad k-exRecs, we may proceed as follows. The k-decoders first
leap past the later of the two bad k-exRecs: past the entire k-exRec, rather than past the k-Rec that this
k-exRec contains. As explained in Sec. 5.2.2 below, the bad k-exRec is thereby effectively replaced by a faulty
0-Ga. Next we are to move the k-decoders another step to the left, past the k-Rec contained in the earlier
of the two bad k-exRecs. Because the k-decoder leapt past the shared k-EC in the previous step, the earlier
k-Rec has been truncated: it is now missing the trailing k-EC that it shares with the later bad k-exRec.

Actually, to enforce the independence of the bad k-exRec and the truncated k-exRec that precedes it,
we must ensure that no level-0 gate is contained in both; therefore we should specify carefully where the
boundary lies between the later untruncated k-exRec and the earlier truncated k-exRec. We define this
boundary by excluding from the earlier truncated k-exRec every 0-Ga that is "contained in" the later
k-exRec. For example, the opening (k-1)-exRecs "contained in" the later k-exRec have leading (k-1)-ECs
that are considered to be part of the later k-exRec, and are not included in the earlier truncated k-exRec.
Similarly, at the next level down, the opening (k-2)-exRecs "contained in" the opening (k-1)-exRecs of the
later k-Rec have leading (k-2)-ECs that are considered to be part of the later k-exRec and are not included
in the earlier truncated k-exRec. And so on at all lower levels.

Now we need to consider what happens when we move the k-decoders further to the left, past the earlier
truncated k-exRec. For this purpose, we can use an obvious identity that follows from the definition of the
j-decoder:

    [ideal j-EC] -- [ideal j-decoder]  =  [ideal j-decoder]

Invoking this identity, we may replace the j-ECs that have been amputated (j = 1, 2, 3, ..., k) by ideal j-ECs
(ones with no faults). This restoration of the amputated ECs proceeds level by level: first the final truncated
1-exRecs are augmented by adding ideal trailing 1-ECs, then the final truncated 2-exRecs are augmented
by adding ideal trailing 2-ECs, and so on at all higher levels. At each step, we can justify adding the ideal
j-ECs by observing that the k-decoder can be realized as j-decoders applied to all j-subblocks, followed by
a (k-j)-decoder.

At this point, the k-decoders are behind the complete k-exRec with restored k-ECs. Furthermore, if the
truncated k-exRec is good (does not contain two independent bad (k-1)-exRecs), then the completed
k-exRec is also good, because there are no bad (j-1)-exRecs in the j-ECs that we inserted (j = 1, 2, 3, ..., k).
Therefore, using exRec-Cor at level k, we may move the k-decoders to the left of the k-Ga, transforming it
to an ideal 0-Ga. We conclude, in other words, that the property exRec-Cor for complete k-exRecs implies
that exRec-Cor is also true for truncated k-exRecs.

Thus we have shown, as desired, that as the k-decoders move to the left, a nonindependent pair of bad
k-Recs is transformed to an ideal 0-Ga followed by a faulty 0-Ga. If, on the other hand, the truncated
k-exRec is bad, we move the k-decoders past the entire truncated k-exRec: the pair of independently bad
k-exRecs is transformed to a pair of faulty 0-Ga's, and the k-exRecs that immediately precede the bad pair
become truncated.

5.2.2 Transforming a bad k-exRec to a faulty 0-Ga

It still remains to justify moving the k-decoders to the left through a bad k-exRec (or a bad truncated
k-exRec), transforming the k-exRec to a faulty implementation of the ideal 0-Ga. Here an annoying technical
point arises. Our induction hypothesis prescribes no special properties of the bad rectangles, which therefore
should be regarded as arbitrary operations (trace-preserving completely positive maps). Thus a bad k-exRec
cannot necessarily be regarded as a level-k simulation of a well-defined (faulty) level-0 gate, because the bad
k-exRec might map distinct valid encodings of the same ideal input to valid encodings of two distinct ideal
outputs. (We say that a state $\rho$ of a k-block is a valid encoding of an ideal pure state $|\psi\rangle$ of a qubit if the
ideal decoder maps $\rho$ to $|\psi\rangle$.) If we attempt to move the ideal k-decoder from behind a bad k-exRec to in
front of it, and in so doing transform the bad k-exRec to a faulty 0-Ga, the snag is that the particular faulty
0-Ga we obtain may depend on the syndrome that the ideal decoder measures. For example, at level k = 1,
errors due to faults in the 1-exRec might combine differently with an input error on the first qubit in the
1-block than with an input error on the second qubit in the 1-block; then applying the ideal 1-decoder to
the output of the 1-exRec in these two cases might yield different output states, even though applying the
ideal 1-decoder to the input yields the same state in both cases.

This observation indicates that if we are to transform a bad k-exRec to a level-0 fault, in doing so we
cannot completely disregard the error syndromes of the incoming k-blocks. Nevertheless the transformation
is possible, if we allow the syndrome to assume the role of an "environment" that interacts with the level-0
data whenever a 0-fault occurs.

Up until now, we have prescribed that the ideal k-decoder discards the error syndrome after performing
the error-recovery operation, but for analyzing the action of bad k-exRecs on the data, it will be convenient
to consider the error syndrome to be part of the output from the ideal decoder. For clarity, we will refer to
the k-decoder that retains the syndrome information as the k-*decoder. Let $|\bar\psi\rangle$ denote the ideal encoding
in a level-k concatenated code of the single-qubit state $|\psi\rangle$, and let $\{E_i\}$ denote the set of correctable Pauli
errors acting on the k-block; then the action of the k-*decoder (denoted $\mathcal{D}$) is

$$\mathcal{D} : E_i |\bar\psi\rangle \mapsto |\psi\rangle \otimes |i\rangle , \tag{13}$$

where $|i\rangle$ denotes the state of the register that records the syndrome. For a perfect code (or for a perfect
code of the CSS type), the states $\{E_i|\bar\psi\rangle\}$ are a complete basis, and eq. (13) completely characterizes the
action of the k-*decoder on the k-block. If the code is not perfect, this action can be extended to a larger
set of Pauli errors that includes some non-correctable errors, such that the states $\{E_i|\bar\psi\rangle\}$ are mutually
orthogonal and complete. Then $\mathcal{D}$ is invertible, and its inverse $\mathcal{D}^{-1}$ is a k-*encoder, which takes the input
qubit state $|\psi\rangle$ and input syndrome $|i\rangle$ to the encoded state $|\bar\psi\rangle$ with error $E_i$. An ideal k-decoder is just the
k-*decoder $\mathcal{D}$ followed by disposal of the output syndrome:

    [ideal k-decoder]  =  [k-*decoder $\mathcal{D}$] -- [discard syndrome]

Suppose we assume the induction hypothesis, that the property exRec-Cor holds at level k. Then,
if a k-exRec is good, we can move a k-decoder that follows the k-exRec to the left, past the k-Rec contained
in the good k-exRec, converting the k-Rec to the corresponding ideal 0-Ga. The same property still holds
when we move the k-*decoder to the left instead. As far as the data is concerned, there is no difference
between the k-decoder (which discards the syndrome that it measures) and the k-*decoder (which retains
the syndrome). The only question is: what happens to the syndrome when the k-*decoder moves left?

For simplicity, consider a good k-exRec that acts on a single input k-block. Moving the k-*decoder
$\mathcal{D}$ past the noisy k-Rec $\mathcal{N}$ generates an operation $\mathcal{D}\mathcal{N}\mathcal{D}^{-1}$ whose input consists of a single qubit and a
syndrome (here our operator ordering convention is that the operator furthest to the right acts first), and
the action of $\mathcal{D}\mathcal{N}\mathcal{D}^{-1}$ on the reduced state of the qubit alone, for any input, is the ideal unitary gate $\mathcal{N}_{\mathrm{ideal}}$.
Therefore, the action of $\mathcal{D}\mathcal{N}\mathcal{D}^{-1}$ on the qubit must be uncorrelated with its action on the syndrome, for
otherwise the qubit would decohere. That is, $\mathcal{D}\mathcal{N}\mathcal{D}^{-1}$ is a tensor product

$$\mathcal{D}\mathcal{N}\mathcal{D}^{-1} = \mathcal{N}_{\mathrm{ideal}} \otimes \mathcal{N}_{\mathrm{syndrome}} , \tag{14}$$

where $\mathcal{N}_{\mathrm{syndrome}}$ is a trace-preserving operation acting on the syndrome alone that depends on the details
of the noise in the k-Rec:

    [k-Rec $\mathcal{N}$] -- [k-*decoder $\mathcal{D}$]  =  [k-*decoder $\mathcal{D}$] -- [$\mathcal{N}_{\mathrm{ideal}}$ on the qubit, $\mathcal{N}_{\mathrm{syndrome}}$ on the syndrome]

In effect, then, the k-Recs contained in good k-exRecs perform two independent quantum computations
in parallel: the ideal processing of the encoded data, and the noisy processing of the syndrome. For our
purposes, the details of how the syndrome is processed are not relevant; all that matters is that a good
noisy circuit processes the data and the syndrome independently, so that in principle we can propagate the
syndrome through the good noisy circuit without interfering with the ideal evolution of the data. Since good
truncated k-exRecs are also correct, as explained in Sec. 5.2.1, they too process the encoded data and the
syndrome independently.

For a bad k-exRec, on the other hand, the processing of the encoded data and of the syndrome are not
independent. If $\mathcal{N}$ denotes the action of the bad k-exRec on the data and syndrome, then we may write
$\mathcal{D}\mathcal{N} = \mathcal{F}\mathcal{D}$, where

$$\mathcal{F} = \mathcal{D}\mathcal{N}\mathcal{D}^{-1} \tag{15}$$

can be regarded as the level-0 fault simulated by the bad k-exRec; diagrammatically,

    [bad k-exRec $\mathcal{N}$] -- [k-*decoder $\mathcal{D}$]  =  [k-*decoder $\mathcal{D}$] -- [0-fault $\mathcal{F}$]

In contrast with the ideal level-0 gate simulated by the k-Rec contained in a good k-exRec, the faulty level-0
operation simulated by a bad k-exRec depends on the syndrome that is input to the k-*encoder. Moving the
k-*decoder left past a bad k-exRec transforms the k-exRec to an operation that actually acts collectively
on the level-0 data and the syndrome. In effect, the syndrome functions as an "environment" that interacts
with the data at the locations where level-0 faults occur. In a level-k circuit that contains s independent bad
k-exRecs, moving the k-*decoder left through the circuit transforms it to a circuit of 0-Ga's, with s faults.
Schematically, two bad k-exRecs, with an intervening good k-circuit, become transformed like this:

    [bad k-exRec] -- [good k-circuit] -- [bad k-exRec] -- [k-*decoder]  =  [k-*decoder] -- [0-fault] -- [ideal 0-circuit] -- [0-fault]

Under this transformation, the syndrome becomes an effective environment that remains isolated from the
data during the ideal 0-Ga's, but interacts with the data during 0-faults. Faults that share an environment
can be strongly correlated with one another, but fortunately this feature does not interfere with the successful
execution of the threshold dance. In the property exRec-Cor, the goodness of an exRec is determined only
by the locations of the faults in the exRec; once the fault locations are determined, the action of the faults
at those locations can be arbitrary. In particular, a good 1-exRec is correct even if the 0-faults contained in
the 1-exRec share a quantum memory.

Now, for the inductive step, we are to assume the property exRec-Cor at level k, and we are to prove
exRec-Cor at level k+1. A (k+1)-decoder following a (k+1)-exRec can be realized as k-decoders acting
on k-blocks, followed by a 1-decoder, and each k-decoder can be regarded as a k-*decoder whose output
syndrome is discarded. When the k-*decoders sweep left to the front of a good (k+1)-exRec, they transform
it to a good 1-exRec. The 0-faults in this 1-exRec share access to an environment (the syndrome), but
otherwise the evolution of the environment is independent of the evolution of the data. The good 1-exRec is
correct, so we can move the 1-decoder left past the 1-Rec, transforming it to the corresponding ideal 0-Ga.
Finally, we move the k-*decoders back to the right to join the 1-decoder. The outputs of these k-*decoders
are discarded, so they are k-decoders which together with the 1-decoder reconstitute the (k+1)-decoder; this
completes the proof of the inductive step.

Straightforward modifications of this argument apply to (k+1)-preparation (for which there are no leading
ECs) and (k+1)-measurement (for which there is no output block).

We have considered (k+1)-exRecs that contain several embedded bad k-exRecs in order to emphasize
that this argument can be applied to codes of any odd distance 2t + 1, where a good (k+1)-exRec is defined
to contain no more than t independent bad k-exRecs. Higher-distance codes will be further discussed in
Sec. 9.

6 Improving the threshold estimate

6.1 Benign and malignant sets of locations 

In the threshold theorem as formulated in Sec. 4, we have estimated the accuracy threshold by counting 
all pairs of locations in the largest 1-exRec. But that estimate is too pessimistic, because there are many 
pairs of locations in the 1-exRec such that arbitrary faults at both locations do not cause the 1-Rec to be 
incorrect. Can we derive a sharper estimate using this observation? 

Let us say that a set of locations in a 1-exRec is benign if the 1-Rec contained in the 1-exRec is correct 
for arbitrary faults occurring at the locations in the set. If the set of locations is not benign, it is malignant.
Using a distance-3 code and fault-tolerant gadgets, we can ensure that any set containing only one location 
is benign. We used this property to prove the threshold theorem in Sec. 4. But there are many other benign 
sets of locations, a fact we can exploit to obtain improved rigorous estimates of the accuracy threshold. 

Our analysis must take into account that overlapping bad k-rectangles need not be independent for k > 1.
We may relax our definition of goodness:

Definition. Goodness and Badness (revised). A 1-exRec is bad if it contains faults at a malignant
set of locations; if it is not bad it is good. For k > 1, a k-exRec is bad if it contains independent bad
(k-1)-exRecs at a malignant set of locations; if it is not bad it is good.

Then exRec-Cor is true at level 1 by definition. Furthermore, with this definition of goodness, the inductive
proof of exRec-Cor at level k+1 (via the threshold dance) proceeds simply. What makes this definition
useful is that the arguments in Sec. 5.2 show that when k-*decoders sweep from behind a good (k+1)-exRec
to in front of it, the (k+1)-exRec is transformed to a good 1-exRec.

6.2 Counting malignant pairs 

Enumerating all malignant sets of locations in a 1-exRec would be a combinatoric challenge. But it is a 
worthwhile and manageable task to count all malignant pairs of locations. 

We will outline how this counting can be done for a perfect distance-3 code of the CSS type, like the
7-qubit Steane code; further details are provided in Sec. 8. Suppose we pick a pair of locations in the 1-exRec,
and we wish to test whether this pair of locations is malignant. First we note that although in our error
model we allow the faults at specified locations to be chosen adversarially, it suffices to test pairs of Pauli
faults to check whether a pair of locations is benign, since an arbitrary fault can be expanded in terms of
Pauli operators. We may therefore consider replacing a faulty single-qubit 0-Ga by one of the four Pauli
operators $\{I, X, Y, Z\}$, or replacing a faulty two-qubit 0-Ga by one of the sixteen tensor products of Pauli
operators in $\{I, X, Y, Z\} \otimes \{I, X, Y, Z\}$. (Equivalently, we may consider inserting Pauli operators other than
the identity right after or right before the gates.)

Furthermore, we can re-express the criterion for correctness of a 1-Rec in terms of the propagation of 
Pauli errors through the 1-Rec. For a perfect distance-3 CSS code, we can choose a basis for a 1-block, such 
that each element of the basis deviates from the code space by at most one X error and at most one Z error. 
With the Pauli faults fixed at a particular pair of locations, we consider Pauli errors afflicting the input to 
the 1-exRec, with at most one X and at most one Z acting at arbitrary positions in each input 1-block. We 
propagate this input error through the leading 1-ECs of the 1-exRec, to find the Pauli error for the output 
of the 1-ECs, which is the input to the 1-Rec. By applying a logical X and/or logical Z as needed, the error 
in the input to the 1-Rec can also be expressed as at most one X and at most one Z. Then we propagate 
this error acting on its input through the 1-Rec to find the corresponding error acting on the output of the 
1-Rec. If, within any output block, there are two or more X errors acting on the block, or two or more Z 
errors, then the pair of locations we are testing has been found to be potentially malignant. (If the 1-Rec 
contains a 0-Ga that is not in the Clifford group, then the output error might not be a Pauli error, but we
can check whether the error has weight higher than 1.) 

With the fault locations fixed, we conduct this test for each possible choice of the error acting on the
input to the 1-exRec, and for each possible choice of the Pauli faults acting at the fixed fault locations. If
for each output block there is no more than one X error and no more than one Z error acting on the block
for all such choices, then the tested pair of locations is benign.

This analysis can be usefully partitioned into consideration of various cases. It is obvious that the pair 
is benign if both fault locations are in one of the leading 1-ECs, since then the 1-Rec has no faults. If each 
of two different leading 1-ECs has a fault (for a 1-Rec that acts on two or more input blocks), then again 
the 1-Rec has no faults, but we must check whether the 1-Ga propagates an error from one block to the 
other. The most delicate case is when there is one fault in one of the leading 1-ECs and one in the 1-Rec. 
This case can be further divided into subtasks. The location of the first fault, together with the input error, 
determines the error in the output from the leading 1-EC. Then, with the error in the input to the 1-Rec
fixed, we can determine whether a second fault at a particular location in the 1-Rec causes multiple errors 
in the output. 

6.3 Refined calculation of the failure probability

Now we should consider how the likelihood of a bad k-exRec is affected by our revised definition of goodness.
We denote the probability that a k-exRec is bad by $\varepsilon^{(k)}$. For a k-exRec to be bad, either independent bad
(k-1)-exRecs occur at a malignant pair of locations, or else there must be independent bad (k-1)-exRecs
at three or more locations (in the latter case it might be that no two of the locations form a malignant pair).
Therefore, since the independent bad (k-1)-exRecs may be regarded as statistically independent, an upper
bound on $\varepsilon^{(k)}$ is

$$\varepsilon^{(k)} \le A\left(\varepsilon^{(k-1)}\right)^2 + B\left(\varepsilon^{(k-1)}\right)^3, \tag{16}$$

where A is the number of malignant pairs of locations, and B is the total number of ways to choose three
locations (where it is not required that at least two of the three form a malignant pair).
It follows from eq. (16) that

$$\varepsilon^{(k)} \le A'\left(\varepsilon^{(k-1)}\right)^2, \tag{17}$$

where

$$\varepsilon < \varepsilon_0 = (A')^{-1} \tag{18}$$

is our threshold estimate and

$$A' = A + B\varepsilon_0 = A + B/A' \;\Longrightarrow\; A' = \tfrac{1}{2} A \left(1 + \sqrt{1 + 4B/A^2}\right). \tag{19}$$

Hence we have 

Theorem 2. Quantum accuracy threshold for independent stochastic noise (revised). Suppose
that fault-tolerant gadgets can be constructed such that all 1-exRecs obey the property exRec-Cor, and such
that $\ell$ is the maximal number of locations in a 1-Rec, and $d$ is the maximal depth of a 1-Rec. Let $\varepsilon_0$ be the
minimal value of $(A')^{-1}$ for any 1-exRec, where $A'$ is given by eq. (19), $A$ is the number of malignant pairs of
locations in the 1-exRec, and $B$ is the total number of ways to choose three locations in the 1-exRec. Suppose
that independent stochastic faults occur with probability $\varepsilon < \varepsilon_0$ at each location in a noisy quantum circuit.
Then for any fixed $\delta$, any ideal circuit with $L$ locations and depth $D$ can be simulated with error $\delta$ or better
by a noisy circuit with $L^*$ locations and depth $D^*$, where

$$L^* = O\left(L (\log L)^{\log_2 \ell}\right), \qquad D^* = O\left(D (\log L)^{\log_2 d}\right). \tag{20}$$
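
As a numerical illustration of eqs. (16)-(19), the following sketch computes A' and the threshold estimate from given counts A and B, and then iterates the bound of eq. (16) level by level. The function names and the values of A and B are placeholders chosen only for illustration; they are not the counts obtained in Sec. 8.

```python
import math


def a_prime(A, B):
    """Positive root of A' = A + B / A', as in eq. (19)."""
    return 0.5 * A * (1.0 + math.sqrt(1.0 + 4.0 * B / A**2))


def threshold(A, B):
    """Threshold estimate eps_0 = 1 / A', as in eq. (18)."""
    return 1.0 / a_prime(A, B)


def failure_bounds(eps, A, B, levels):
    """Iterate the upper bound of eq. (16) starting from the level-0 fault rate eps."""
    bounds = [eps]
    for _ in range(levels):
        prev = bounds[-1]
        bounds.append(A * prev**2 + B * prev**3)
    return bounds


# Hypothetical counts of malignant pairs (A) and triples of locations (B).
A, B = 1000.0, 1.0e6
eps0 = threshold(A, B)

# Starting below the threshold estimate, the bound shrinks at every level.
bounds = failure_bounds(0.5 * eps0, A, B, levels=4)
for level, bound in enumerate(bounds):
    print(level, bound)
```

Below the threshold estimate, each iterate also satisfies the simpler bound of eq. (17), which is what drives the double-exponential suppression of the failure probability with the concatenation level.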

Sometimes, it is convenient to design gadgets that are nondeterministic: the gadgets include subroutines
in which certain ancilla states are prepared and verified, where the ancilla is discarded if the verification
test fails. In that case, strictly speaking a 1-exRec contains many ancilla preparations performed in parallel,
where just one of these ancillas is accepted and used in the circuit that follows. It would be overly pessimistic
to include all the locations in the preparation and verification of rejected ancillas in the estimate of B used
in eq. (19), and it is possible to perform a refined count from which many sets of three locations can be
excluded. Alternatively, by estimating the probability that an ancilla is accepted and by using Bayes' rule,
we can bound the probability of failure for each type of exRec, given that all the ancillas used in that exRec
are successfully verified. In Sec. 8.3 we will use the latter method in conjunction with Theorem 2 to obtain
an explicit lower bound on the quantum accuracy threshold: $\varepsilon_0 > 2.73 \times 10^{-5}$.
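
The Bayes'-rule step can be written schematically as follows. The event labels "bad" (the exRec is bad) and "acc" (all ancillas used in the exRec are accepted), and the bounds $u$ and $v$, are notation introduced here for illustration only; the detailed estimate is the one given in Sec. 8.3.

```latex
P(\mathrm{bad} \mid \mathrm{acc})
  = \frac{P(\mathrm{bad} \cap \mathrm{acc})}{P(\mathrm{acc})}
  \le \frac{u}{v}
\qquad \text{whenever} \qquad
P(\mathrm{bad} \cap \mathrm{acc}) \le u, \quad P(\mathrm{acc}) \ge v > 0 .
```

An upper bound on the joint probability together with a lower bound on the acceptance probability therefore suffices to bound the conditional failure probability.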

7 Construction of fault-tolerant gadgets: generalities 

The threshold theorem proved in Sec. 4 is premised on the existence of level-1 gadgets (based on a distance-3
code) that obey the property exRec-Cor. In this section we will explain how such gadgets can be constructed. 
Then in Sec. 8 we will use one such construction to obtain an explicit estimate of the quantum accuracy 
threshold. 

We have seen in Sec. 2 and 3 that exRec-Cor at level 1 holds for level-1 error correction and gate gadgets 
with the following properties: 

0. If a 1-EC contains no fault, it takes any input to an output in the code space.

0'. If a 1-EC contains one fault, it takes any input to a valid output. (The state of a level-1 block is "valid"
if it deviates from the code space by the action of a weight-1 operator.)

1. If a 1-EC contains no fault, it takes an input with at most one error to an output with no errors.

2. If a 1-EC contains at most one fault, it takes an input with no errors to an output with at most one
error.

3. If a 1-Ga contains no fault, it takes an input with at most one error to an output with at most one error
in each output block.

4. If a 1-Ga contains at most one fault, it takes an input with no errors to an output with at most one
error in each output block.

Therefore, it will suffice to verify that the 1-gadgets satisfy properties 0-4.

7.1 Stabilizer codes

Before proceeding to explicit gadget constructions, we briefly review the theory of binary stabilizer codes
[25, 26], which are well suited for applications to fault-tolerant quantum computing. The code space of a
stabilizer code of length $n$ is the simultaneous eigenspace of a set $\mathcal{G}$ of commuting Pauli operators; these
operators generate the code's stabilizer $\mathcal{S}$, an abelian subgroup of the $n$-qubit Pauli group
$\mathcal{C}_1^{(n)} = \{\pm 1, \pm i\} \cdot \{I, X, Y, Z\}^{\otimes n}$. Up to an overall phase, an element $P_a$ of $\mathcal{C}_1^{(n)}$ can be labeled by a
length-$2n$ binary vector $a$, where

$$P_a = \bigotimes_{i=1}^{n} X^{a_i} \cdot \bigotimes_{i=1}^{n} Z^{a_{i+n}}, \tag{21}$$

and $a_i$ denotes the $i$th component of $a$. Two elements $P_a$ and $P_b$ of $\mathcal{C}_1^{(n)}$ obey the commutation relation

$$P_a P_b = (-1)^{a \Lambda b^T} P_b P_a, \tag{22}$$

where

$$\Lambda = \begin{pmatrix} 0_n & I_n \\ I_n & 0_n \end{pmatrix}; \tag{23}$$

here $0_n$ and $I_n$ are the zero and identity $n \times n$ matrices respectively.

For a length-$n$ stabilizer code with $k$ encoded qubits, the generating set $\mathcal{G} = \{G(1), G(2), \ldots, G(n-k)\}$
of the stabilizer $\mathcal{S}$ contains $n-k$ independent commuting elements of $\mathcal{C}_1^{(n)}$. The set $\mathcal{G}$ can be represented by
the $(n-k) \times 2n$ binary matrix

$$G = \begin{pmatrix} g(1) \\ g(2) \\ \vdots \\ g(n-k) \end{pmatrix}; \tag{24}$$

here $g(i)$ is the binary vector labeling the Pauli operator $G(i)$. Each $G(i)$ squares to the identity, and has
eigenvalues $\pm 1$ with equal degeneracy; thus the $2^n$-dimensional $n$-qubit Hilbert space decomposes into $2^{n-k}$
disjoint eigenspaces, each of dimension $2^k$. Each subspace can be labeled by an $(n-k)$-bit binary vector $e$,
where the eigenvalue of $G(i)$ is $(-1)^{e_i}$, for $i = 1, \ldots, n-k$. This vector $e$ is called the syndrome of the
subspace; the code space, which is the simultaneous eigenspace with eigenvalue one of all the generators, has
syndrome $e = (00 \ldots 0)$. Using eq. (22), we see that the action of a Pauli operator $P_a$ on the code space
changes the syndrome to the value $e = a \Lambda G^T$.
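
The syndrome rule $e = a \Lambda G^T$ can be checked on a small example. The sketch below assumes the 7-qubit Steane code, with the rows of the Hamming parity-check matrix used for both the X-type and the Z-type generators, and labels a Pauli operator by its X part followed by its Z part as in eq. (21). The helper names are illustrative.

```python
# Rows of the Hamming parity-check matrix (assumed generator supports for the Steane code).
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
n = 7

# Generator matrix G: three X-type rows (h | 0) followed by three Z-type rows (0 | h).
G = [h + [0] * n for h in H] + [[0] * n + h for h in H]


def symplectic_product(a, b):
    """Compute a Lambda b^T (mod 2); the result is 1 when P_a and P_b anticommute."""
    half = len(a) // 2
    return sum(a[i] * b[half + i] + a[half + i] * b[i] for i in range(half)) % 2


def syndrome(a, generators):
    """Compute e = a Lambda G^T (mod 2), one bit per generator."""
    return [symplectic_product(a, g) for g in generators]


def pauli_vector(num_qubits, x_positions=(), z_positions=()):
    """Binary label of a Pauli with X on x_positions and Z on z_positions (0-indexed)."""
    a = [0] * (2 * num_qubits)
    for i in x_positions:
        a[i] = 1
    for i in z_positions:
        a[num_qubits + i] = 1
    return a


# An X error on the third qubit is seen only by the Z-type generators.
print(syndrome(pauli_vector(n, x_positions=[2]), G))
# Every generator commutes with all generators, so its syndrome is trivial.
print([syndrome(g, G) for g in G])
```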

For a nondegenerate stabilizer code that corrects $t$ errors, all $P_a$ with weight $w \le t$ take the code space
to mutually orthogonal subspaces with distinct syndromes; thus, under the assumption that no more than
$t$ errors occurred, each syndrome $e$ points to a unique Pauli operator. In error correction, the syndrome $e$
is measured, and then $P_a^{-1}$ is applied, where $P_a$ is the unique Pauli operator with weight no more than $t$
such that $e = a \Lambda G^T$. If the code is degenerate, then $P_a$ may not be unique, but each $P_a$ of weight up to $t$
satisfying $e = a \Lambda G^T$ is equally effective in correcting the error.
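
For a nondegenerate code with $t = 1$, the map from syndromes to recovery operators can be tabulated directly. The following sketch does this for the 7-qubit Steane code, under the same assumptions about the generators as stated in the comments; the names `TABLE` and `recovery_for` are illustrative. Since a Pauli operator is its own inverse up to a phase, the tabulated label also identifies the recovery operation.

```python
# Hamming parity checks, assumed here as supports of both X-type and Z-type generators.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
N = 7
G = [h + [0] * N for h in H] + [[0] * N + h for h in H]


def syndrome(a):
    """Return e = a Lambda G^T (mod 2) as a tuple, one bit per generator."""
    return tuple(
        sum(a[i] * g[N + i] + a[N + i] * g[i] for i in range(N)) % 2 for g in G
    )


def low_weight_paulis():
    """Yield the binary labels of all Pauli operators of weight at most 1."""
    yield [0] * (2 * N)  # identity
    for i in range(N):
        for x_bit, z_bit in ((1, 0), (0, 1), (1, 1)):  # X, Z, Y on qubit i
            a = [0] * (2 * N)
            a[i], a[N + i] = x_bit, z_bit
            yield a


# Build the lookup table; a repeated syndrome would mean that two
# weight <= 1 errors cannot be distinguished by the syndrome.
TABLE = {}
for a in low_weight_paulis():
    e = syndrome(a)
    if e in TABLE:
        raise ValueError("two low-weight Pauli operators share a syndrome")
    TABLE[e] = a


def recovery_for(e):
    """Return the label of the weight <= 1 Pauli with syndrome e, or None if there is none."""
    return TABLE.get(tuple(e))
```

The table has one entry for the identity and three for each qubit; syndromes outside the table correspond to errors of weight two or more.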

7.2 Fault-tolerant error correction 

We wish to construct a level-1 error correction gadget (1-EC) that satisfies properties 0-2. The key goal
guiding the construction is property 2: we must ensure that a 1-EC with a single fault, when applied to
an input with no errors, produces an output with at most one error.

7.2.1 Syndrome measurement using cat states 

The error correction consists of syndrome measurement followed by a recovery step. How can the syndrome
be measured? Putting aside for the moment any concerns about fault tolerance, there is a general method
for constructing a quantum circuit for measuring an $n$-qubit unitary operator $U$ that has eigenvalues $\pm 1$. We
may prepare a single ancilla qubit in the state $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$, and then apply $U$ to a block of $n$ qubits,
conditioned on the value of the ancilla (that is, $I$ is applied if the state of the ancilla is $|0\rangle$ and $U$ is applied
if the state of the ancilla is $|1\rangle$). Then we measure the ancilla qubit in the basis $\{|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}\}$. The
outcome $|+\rangle$ for the ancilla measurement indicates that the eigenvalue of $U$ is $+1$, and the outcome $|-\rangle$ for
the ancilla measurement indicates that the eigenvalue of $U$ is $-1$.
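
The reason this circuit measures $U$ can be seen in one line. Writing $|\psi\rangle$ for the state of the $n$-qubit block (a symbol introduced here for illustration), the controlled-$U$ acts as

```latex
|+\rangle \otimes |\psi\rangle
  \;\longrightarrow\;
  \frac{1}{\sqrt{2}} \Bigl( |0\rangle \otimes |\psi\rangle + |1\rangle \otimes U|\psi\rangle \Bigr)
  = |+\rangle \otimes \frac{I + U}{2}\,|\psi\rangle
  + |-\rangle \otimes \frac{I - U}{2}\,|\psi\rangle .
```

Because $U$ is unitary with eigenvalues $\pm 1$, the operators $(I \pm U)/2$ are the projectors onto its $\pm 1$ eigenspaces, so the ancilla outcome $|\pm\rangle$ is perfectly correlated with the eigenvalue of $U$.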

For the case of a weight-$w$ Pauli operator, this circuit consists of the ancilla preparation, a sequence of $w$
two-qubit gates (each a conditional Pauli operator acting on the ancilla and one of the qubits in the code 
block), and a final measurement of the ancilla qubit. This procedure is not fault tolerant, because the ancilla 
qubit interacts with multiple qubits in the same code block. A single fault could damage the ancilla qubit, 
and that error could propagate, infecting several of the qubits in the block. 

A better procedure is to replace the single ancilla qubit by an encoded ancilla block. For example, to measure
a weight-$w$ Pauli operator, we could prepare the ancilla in the "cat" state $|+\rangle_{\rm rep} = (|0\rangle^{\otimes w} + |1\rangle^{\otimes w})/\sqrt{2}$
of the $w$-qubit quantum repetition code [1, 27]. Now each qubit in the ancilla block interacts with just one
qubit in the data block, so that if both the cat state and the input data block have no errors, a single fault
in the circuit results in just one error in the data block. The final destructive measurement of the ancilla
block in the basis $\{|\pm\rangle_{\rm rep}\}$ is equivalent to a measurement of $X^{\otimes w}$; it can be achieved by measuring $X$ on
each of the $w$ ancilla qubits and evaluating the parity of the outcomes.

But our procedure must also include the fault-tolerant preparation of $|+\rangle_{\rm rep}$. A complete procedure for
measuring one bit of the syndrome is shown in Fig. 6 (where the Pauli operator being measured is $X^{\otimes 4}$).
To prevent a single fault in the encoding circuit from causing multiple errors in the output data block, a
verification step is included. If the outcome of the verification measurement is $Z = -1$, then the cat state
might have multiple errors; the state is discarded before it ever comes into contact with the data, and the
preparation is repeated. If the verification succeeds, then at least two faults are required to introduce two
or more errors into the data block.

Fig. 6. Cat-state method for fault-tolerant measurement of a Pauli operator, in this case $X^{\otimes 4}$. A
four-qubit cat state $(|0\rangle^{\otimes 4} + |1\rangle^{\otimes 4})/\sqrt{2}$ is prepared and verified, then controls the application of
the Pauli operator to the data block. All four ancilla qubits are measured in the X basis, and the
parity of the measurement outcomes determines the eigenvalue of the measured Pauli operator.



To see why the procedure is fault-tolerant, it is important to understand how errors are propagated 
by CNOT gates. A CNOT propagates an X error "forward" from its control qubit to its target qubit, and 
it propagates a Z error "backward" from its target qubit to its control qubit. Thus while a Z error in 
the cat state might result in a faulty syndrome measurement, it is the X errors in the cat state that are 
especially dangerous, for these are the errors that might propagate into the data block. Since the ideal cat 
state is actually an eigenstate with eigenvalue one of $X^{\otimes 4}$, it cannot be afflicted with more than two X
errors; when two X errors occur, only one of the two qubits that participate in the verification test will have 
been affected, so that the test (if performed without faults) will detect the damage. It is easy to adapt the 
principles underlying the construction of this circuit to achieve a fault-tolerant measurement of any Pauli 
operator. 
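
The propagation rules for CNOT gates can be made concrete with a few lines of bookkeeping. The sketch below tracks a Pauli error as two bit lists (X part and Z part), ignoring overall phases, and compares a single ancilla qubit that controls four gates with a four-qubit ancilla in which each qubit controls one gate. The qubit numbering and function names are assumptions made for this illustration.

```python
def cnot_propagate(x, z, control, target):
    """Conjugate the Pauli error X^x Z^z by CNOT(control -> target), ignoring phase."""
    x, z = list(x), list(z)
    x[target] ^= x[control]  # an X error on the control spreads forward to the target
    z[control] ^= z[target]  # a Z error on the target spreads backward to the control
    return x, z


def run_cnots(x, z, gates):
    """Propagate an error through a sequence of (control, target) CNOT gates."""
    for control, target in gates:
        x, z = cnot_propagate(x, z, control, target)
    return x, z


# Single ancilla (qubit 0) controlling gates onto data qubits 1..4:
# one X error on the ancilla reaches every data qubit.
bad_x, _ = run_cnots([1, 0, 0, 0, 0], [0] * 5, [(0, 1), (0, 2), (0, 3), (0, 4)])

# Four ancilla qubits (0..3), each controlling a gate onto one data qubit (4..7):
# one X error on an ancilla qubit reaches only one data qubit.
good_x, _ = run_cnots([1, 0, 0, 0, 0, 0, 0, 0], [0] * 8, [(i, 4 + i) for i in range(4)])

# A Z error on a data qubit flows back to the ancilla qubit that controls it.
_, back_z = run_cnots([0] * 8, [0, 0, 0, 0, 1, 0, 0, 0], [(i, 4 + i) for i in range(4)])
```

In the first case the data block ends up with four X errors, while in the second it has only one, which is the reason for spreading the ancilla over several qubits.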

Finally, after the complete syndrome e is measured, a weight-1 Pauli operator is applied to correct the
error. Since a single fault in the circuit might cause both an error in the data block and an error in a
syndrome bit, the syndrome measurement should be repeated to ensure accuracy. If the same syndrome is
found twice in a row, it is safe to accept the syndrome and recover from the error accordingly. (As we will
explain in Sec. 8, for some fault-tolerant constructions, the error recovery step is not necessary at all; rather
the propagation of errors through subsequent gates can be tracked by a classical computation.) Hence the
full procedure obeys property 2.

7.2.2 Syndrome measurement using encoded blocks 

With the cat state method, a separate encoded ancilla is used to measure each of the code's stabilizer 
generators. Steane [28] proposed another fault-tolerant method for measuring the syndrome that is more 
highly parallelized. Steane's method requires fewer gates than the cat state method, and is especially
advantageous if the identity gate and nontrivial gates have comparable fault rates (that is, if storage errors 
are about as likely as gate errors). 

Steane's method applies to CSS codes [29, 30], for which each generator in $\mathcal{G}$ can be chosen to be either
a tensor product of Xs and Is (an X-type generator) or a tensor product of Zs and Is (a Z-type generator).
For any CSS code with k = 1 encoded qubit, a CNOT gate can be implemented transversally: an encoded
CNOT is realized by performing (in parallel) a CNOT from each qubit in the control block to the qubit in
the corresponding position in the target block [1, 31].

In Steane's method, the ancillas for the syndrome measurement are encoded using the same CSS quantum
error-correcting code as protects the data. To measure all of the Z-type generators, the ancilla is prepared
in the encoded state $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$, and an encoded CNOT is applied with the data block as control
and the ancilla block as target. Since $|+\rangle$ is an eigenstate with eigenvalue one of X = NOT, the CNOT has no
effect on the encoded state of the ancilla or the data, but the CNOT gates propagate each X error in the
data block to the corresponding position of the ancilla block. Assuming, then, that the ancilla block had no
errors of its own initially, and that none of the CNOT gates are faulty, the Z-type syndrome (which detects
the X errors) can be extracted by measuring all of the ancilla qubits in the Z basis, and applying a classical
parity check matrix to the outcomes. Specifically, for the Z-type stabilizer generator $Z(b) = \bigotimes_{i=1}^{n} Z^{b_i}$, if
the outcome of the Z measurement is $z = (z_1, z_2, \ldots, z_n)$, then the measured eigenvalue of $Z(b)$ is $b \cdot z$ (mod
2). Likewise, the X-type syndrome (which detects the Z errors) can be extracted by preparing the ancilla
in the encoded state $|0\rangle$, applying an encoded CNOT with the ancilla block as control and the data block
as target, measuring each qubit in the ancilla block in the X basis, and applying a classical parity check to
the measurement outcomes. For the X-type stabilizer generator $X(a) = \bigotimes_{i=1}^{n} X^{a_i}$, if the outcome of the X
measurement is $x = (x_1, x_2, \ldots, x_n)$, then the measured eigenvalue of $X(a)$ is $a \cdot x$ (mod 2).
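
The classical post-processing in Steane's method is an ordinary parity-check computation. The sketch below assumes the 7-qubit Steane code, whose Z-type generators are supported on the rows of the Hamming parity-check matrix, and assumes that the ancilla readout is a Hamming codeword with at most one flipped bit. The function names and the sample bit string are illustrative.

```python
# Rows b of the Hamming parity-check matrix; column j (1-indexed) is j written in binary.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def parity_checks(z):
    """Return b . z (mod 2) for every row b of H, given the measured bit string z."""
    return [sum(b_i * z_i for b_i, z_i in zip(b, z)) % 2 for b in H]


def locate_single_error(z):
    """Return the 0-indexed position of a single flipped bit, or None if all checks pass."""
    s = parity_checks(z)
    position = 4 * s[0] + 2 * s[1] + s[2]  # the syndrome bits spell the position in binary
    return position - 1 if position else None


# A Hamming codeword (here the third row of H) with its fifth bit flipped,
# standing in for an X error copied from the data block onto the ancilla.
codeword = [1, 0, 1, 0, 1, 0, 1]
readout = list(codeword)
readout[4] ^= 1

print(parity_checks(readout))        # Z-type syndrome bits
print(locate_single_error(readout))  # position of the X error in the data block
```

The same computation, applied to X-basis outcomes, gives the X-type syndrome.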

The encoded |0) and |+) are prepared using encoding circuits that are not fault tolerant, and therefore 
a single fault during encoding might result in more than one error in the encoded state; as for the cat state 
method, a verification step is needed to enforce property 2. For the encoded |0), it is the X errors that 
might propagate from ancilla to data, and the ancilla should be rejected if the verification detects X errors. 
One way to conduct the verification is to prepare two blocks in the state |0), the ancilla block and a verifier 
block. A CNOT is applied with the ancilla block as control and the verifier block as target, the qubits in the 
verifier block are measured in the Z basis, and a parity check is applied to the measurement outcomes. In 
this case, not just the Z-type stabilizer generators are extracted; after a classical error correction step, the 
eigenvalue of the encoded Z (another Z-type Pauli operator) is also found. The ancilla block is rejected if 
the measurement of the verifier block detects a nontrivial syndrome or if the eigenvalue of Z is $-1$. The
verification of the |+) ancilla can be conducted similarly. 

The syndrome measurement circuit, including the verification of the encoded ancillas, is shown in Fig. 7. 
A single fault during encoding of the ancilla block (not shown in Fig. 7) may propagate badly within that 
block. But if the ancilla is badly damaged, and there is only one fault in the complete syndrome measurement 
circuit, then the verification is perfect — the ancilla will be rejected, preventing the errors from propagating 
to the data. 

The syndrome measurement is followed by a recovery step, in which at most one X and at most one Z
are applied to the data block. The recovery operation can be safely applied even if the syndrome is measured
only once. A single fault during syndrome measurement might result in both an incorrect syndrome and
an error in the data; however, using Steane's method, the incorrect syndrome affects the recovery operation
only at a position in the data block that is already damaged by the fault. Therefore, the faulty syndrome
together with the error caused by the fault cannot combine to produce errors at two distinct positions in the
data block, and the error correction procedure satisfies property 2.

Fig. 7. Steane's method for measuring the syndrome of a CSS code using encoded ancilla blocks.
(a) For measurement of the X-type stabilizer generators, an ancilla block and a verifier block
are prepared in the state $|0\rangle$. After a transversal CNOT from ancilla block to verifier block, the
verifier block is measured and a classical parity check is computed to test the ancilla for X errors.
If the ancilla is accepted, a CNOT is applied from ancilla block to data block, the ancilla qubits
are measured in the X basis, and a classical parity check is computed to extract the measured
eigenvalues of the X-type generators. (b) A similar procedure is used to measure the Z-type
generators, with the ancilla and verifier prepared in the state $|+\rangle$, with CNOT gates acting in the
opposite direction, and with qubits measured in the conjugate bases.

Note that this syndrome measurement procedure, like the cat state procedure described earlier, is
nondeterministic: there are probabilistic fluctuations in the number of encoded blocks that must be prepared
before an ancilla is successfully verified. There are some advantages to replacing the procedure with a
deterministic one, in which a fixed number of encoded blocks are verified; for example, in that case we can make
more definite statements about the overhead cost of the procedure. If we don't mind paying the price of
making the accuracy threshold slightly worse, we can replace the nondeterministic procedure by a
deterministic one that uses more ancilla blocks. But since we are more interested here in obtaining a good estimate of
the threshold rather than in minimizing the (worst-case) overhead, we will stick with the nondeterministic
procedure in our analysis. The fact that ancillas are sometimes rejected has an impact on our calculation of
the threshold, as will be explained in Sec. 8.3.

7.2.3 Syndrome measurement using encoded Bell pairs 

Knill [13] proposed another fault-tolerant method for measuring the syndrome, based on quantum
teleportation, that also uses encoded ancilla blocks. In this method, depicted in Fig. 8, a pair of ancilla blocks is
prepared in the encoded Bell state $|\Phi_0\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$. Then the data block together with one of the
ancilla blocks is measured transversally in the Bell basis, and a Pauli operator is applied to the other ancilla
block to complete the "teleportation" of the data.

The outcome of the destructive Bell measurement is $(P_m \otimes I)|\Phi_0\rangle^{\otimes n}$, where $P_m$ is an $n$-qubit Pauli
operator. If the ancilla has no errors, this outcome indicates an error syndrome $e = m \Lambda G^T$ for the data, which
points to the recovery Pauli operator $P_r$. Thus applying the operator $P_m \cdot P_r = P_{m+r}$ to the unmeasured
ancilla block implements the encoded Pauli operator needed for teleportation of the data block, and at the
same time corrects the errors in the data. To ensure fault tolerance, the encoded Bell pair must be verified.

Fig. 8. Error correction by teleportation. A pair of encoded ancilla blocks is prepared in the
Bell state $|\Phi_0\rangle$; then a transversal CNOT and measurements in the X and Z bases are used to
perform a destructive Bell measurement on one of the ancilla blocks and the data block, yielding
the outcome $(P_m \otimes I)|\Phi_0\rangle^{\otimes n}$, where $P_m$ is a Pauli operator. This outcome determines an error
syndrome pointing to the recovery Pauli operator $P_r$. Thus applying $P_{m+r}$ to the unmeasured
ancilla block completes the teleportation of the data while also correcting errors.

While Steane's method applies only to CSS codes, Knill's method works for any stabilizer code.
Furthermore, it provides good protection against leakage errors, since leaked qubits are replaced when teleportation
is carried out [32]. And Knill has shown [13] that this method, combined with other tricks, seems to yield
especially favorable estimates of the threshold. It is an important open problem to derive a rigorous lower
bound on the quantum accuracy threshold based on Knill's circuitry, but we will use Steane's method in our
analysis in this paper.

7.2.4 Properties 0, 0', and 1

Property 1 merely says that a 1-EC with no faults can correct one error successfully, which is true for the 
procedures explained above. 

Property 0 says that a 1-EC with no faults takes any input to the code space. For a perfect code, each
value of the syndrome points to a unique correctable error. But even if the code is not perfect, we may by
convention assign to each syndrome one particular Pauli operator that returns the subspace labeled by the
syndrome to the code space. Therefore, if $|\psi\rangle$ denotes a state in the code space, then for each Pauli error
$E_a$, the ideal 1-EC maps $E_a|\psi\rangle$ to $O_a|\psi\rangle$, where $O_a$ is some encoded operation that may depend on $a$. Any
state of a data block can be purified, expressed as a pure state of the data and its environment E, and
the ideal 1-EC can be expressed as a unitary transformation acting on the data and an ancilla A according
to

$$\sum_a E_a|\psi\rangle \otimes |a\rangle_E \otimes |0\rangle_A \;\to\; \sum_a O_a|\psi\rangle \otimes |a\rangle_E \otimes |a\rangle_A, \tag{25}$$

where the states $\{|a\rangle_E\}$ and $\{|a\rangle_A\}$ of environment and ancilla are not assumed to be normalized or mutually
orthogonal. After tracing over the environment and ancilla, the state of the data block will be, in general, a mixture
of pure states in the code space.

Property 0' says that a 1-EC with one fault takes any input to a valid output (a superposition of states 
that each deviate from the code space by the action of a weight-1 Pauli operator). Though property 0' holds
for other error correction schemes as well, it is most easily explained for Steane's method. 

Suppose, for now, that the fault in the 1-EC circuit occurs during the syndrome measurement, not during
the final recovery step. Because the circuit is fault tolerant, the fault affects the output data block in at
most one position, and it also affects the outcome of the ancilla measurement in at most one position.
Furthermore, when we consider all the possible fault locations in the 1-EC, and all the possible ways for a
Pauli error introduced by the fault to propagate through the 1-EC circuit, we conclude that the damage due
to the fault occurs at the same position in the data block and in the ancilla block. Call this position i.

Therefore, if the input data block has Pauli error $E_a$, the error in the output data block is $E^{(i)} E_a$, where
$E^{(i)}$ is a Pauli operator with support at the position $i$. The syndrome measured by the decoder will be
the syndrome associated with $E'^{(i)} E_a$, where $E'^{(i)}$ has support at position $i$, but it might be interpreted as
the syndrome of another error $E_d$ with recovery operator $E_d^\dagger$. However, since $E_d$ and $E'^{(i)} E_a$ have the same
syndrome, $E_d^\dagger E'^{(i)} E_a = O_a$ preserves the code space, and we may write $E_d = E'^{(i)} E_{a'}$ where $E_{a'}^\dagger E_a = O_a$.

Therefore, after the recovery operation is applied, the combined effect of the input error, the fault, and
the recovery step on the data block is

$$\left(E'^{(i)} E_{a'}\right)^\dagger \left(E^{(i)} E_a\right) = \left(E_{a'}^\dagger \left(E'^{(i)\dagger} E^{(i)}\right) E_{a'}\right) \left(E_{a'}^\dagger E_a\right) = E''^{(i)} O_a, \tag{26}$$

where $E''^{(i)}$ is also a Pauli operator with support at position $i$. Thus, the output of the 1-EC deviates from
the code space by the action of the weight-one Pauli operator $E''^{(i)}$.

We should also consider what happens if the fault occurs in the final recovery step. Then if the fault 
affects the Pauli gate that is intended to correct the error, the error might not be corrected successfully, but 
no other error will appear. Otherwise the fault is a storage error affecting one of the resting qubits in the 
final step. Then the fault does not interfere with the recovery Pauli operator, and the only error is the new 
one introduced by the fault. 

Finally, since any input state can be expanded in terms of states with Pauli errors, and any fault can 
be expanded in terms of Pauli faults, the 1-EC maps any input to a valid output, proving property 0'. A 
similar argument shows that for a code that corrects t errors, a 1-EC with s faults maps any input to an 
output that deviates from the code space by the action of weight-s Pauli operators. 

This observation completes our general discussion of how error correction gadgets satisfying properties 
0-2 are constructed. 

7.3 Fault-tolerant encoded gates 

Next we discuss how to construct a universal set of level-1 gate gadgets obeying properties 3 and 4. For an 
efficient recursive simulation, we should choose a universal set of 0-Ga's that enables us to build a relatively 
simple fault-tolerant 1-Ga to simulate each 0-Ga in the set. 

The quest for a universal set of fault-tolerant gates can be guided by a helpful classification of n-qubit 
unitary transformations [22]. We have already discussed the n-qubit Pauli group $C_1^{(n)}$. The Clifford group 
$C_2^{(n)}$ contains the n-qubit unitaries whose action by conjugation maps Pauli operators to Pauli operators. 
The Clifford group is generated by the Hadamard gate $H = (X + Z)/\sqrt{2}$, the phase gate $S = U_z(\pi/2) = 
\exp(-i\frac{\pi}{4}Z)$, and the controlled-not gate CNOT $= \Lambda(X)$. The set $C_r^{(n)}$ (which is not a group for $r \ge 3$) contains 
the n-qubit unitaries whose action by conjugation maps Pauli operators to the set $C_{r-1}^{(n)}$. Important elements 
of $C_3$ that are not in $C_2$ include the Toffoli gate $\Lambda^2(X)$, the conditional Hadamard gate $\Lambda(H)$, the conditional 
phase gate $\Lambda(S)$, and the single-qubit rotation $T = U_z(\pi/4) = \exp(-i\frac{\pi}{8}Z)$. The Clifford group is not 
dense in $U(2^n)$; in fact, quantum computation using Clifford group gates, Pauli operator measurements, 
and Pauli operator eigenstate preparations can be simulated efficiently with a classical computer [33]. But 
by adding to the generators of $C_2$ an element of $C_3$ that is not contained in $C_2$, we can obtain a universal 
gate set. 
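
The single-qubit part of this classification can be checked numerically. The following Python sketch is an illustration only, not part of the analysis; the helper names (`conj_by`, `in_clifford`, `in_c3`) are our own and hypothetical. It tests that H and S map Pauli operators to Pauli operators under conjugation, while T maps X to a non-Pauli operator that nevertheless lies in the Clifford group.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def scale(c, a):
    """Multiply every entry of a 2x2 matrix by the scalar c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def conj_by(u, p):
    """Return u p u^dagger."""
    return matmul(matmul(u, p), dagger(u))

def u_z(theta):
    """U_z(theta) = exp(-i theta Z / 2), as a diagonal matrix."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
H = scale(1 / math.sqrt(2), [[1, 1], [1, -1]])
S = u_z(math.pi / 2)
T = u_z(math.pi / 4)

def is_pauli(m):
    """True if m equals I, X, Y, or Z up to a phase in {1, -1, i, -i}."""
    return any(close(m, scale(ph, p))
               for p in (I2, X, Y, Z) for ph in (1, -1, 1j, -1j))

def in_clifford(u):
    """Single-qubit test: conjugation by u maps X and Z to Pauli operators."""
    return all(is_pauli(conj_by(u, p)) for p in (X, Z))

def in_c3(u):
    """Single-qubit test: conjugation by u maps X and Z into the Clifford group."""
    return all(in_clifford(conj_by(u, p)) for p in (X, Z))

print(in_clifford(H), in_clifford(S), in_clifford(T), in_c3(T))
```

Checking the images of X and Z suffices because Y is proportional to their product.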

Let us suppose that our 0-Ga's include the gates H, S, and CNOT that generate $C_2$, as well as the 
preparation of a qubit in the state $|0\rangle$ and a Z measurement. With these tools, for any stabilizer code we can 
build fault-tolerant level-1 gadgets that realize the encoded Pauli operators, the preparation of the encoded 
state $|0\rangle$, and the measurement of encoded Pauli operators. (The measurement of encoded Pauli operators 
includes a classical parity computation which we assume to be reliable.) Furthermore, preparation of $|0\rangle$, 
Pauli operators, and Pauli measurements are adequate for realizing each of H, S, and CNOT. By this route 
level-1 $C_2$ generators satisfying properties 3 and 4 can be constructed for any stabilizer code [31]. 

For some codes the construction of fault-tolerant $C_2$ gates is particularly simple. We say that a 1-Ga is 
transversal if it is realized in a single time step by 0-Ga's acting in parallel, such that all multiple-qubit 0-Ga's 
act on qubits at corresponding positions in distinct blocks. Transversal gates are fault tolerant. Property 3 
is satisfied, because although an ideal 1-Ga might propagate an error from one block to another, it cannot 
propagate an error from one position in a block to another position in the same block. And property 4 is 
also satisfied, because a faulty 0-Ga cannot cause two errors in the same block. 

Thus, in fault-tolerant simulations, it is advisable to use quantum error-correcting codes for which 
transversal gates can be constructed. For any CSS code with k = 1 encoded qubit, the encoded CNOT 
gate can be implemented transversally. And if the CSS code is constructed from a punctured doubly-even 
self-dual classical code, the encoded H and S gates are also transversal [1]. An example of a code with these 
properties is Steane's [[7,1,3]] quantum code [16]; we will describe the details of the level-1 gadgets for this 
code in Sec. 8. 

But these generators of the Clifford group do not comprise a universal gate set, and indeed it does not 
appear to be possible, for any code, to construct a universal set of transversal 1-Ga's. Fortunately, though, 
there is a general scheme for using the Clifford 1-Ga's and a 0-Ga in $C_3$ to build a fault-tolerant 1-Ga that 
simulates the $C_3$ gate [1, 22, 34]. This scheme involves the off-line preparation and verification of quantum 
software that is then consumed during the execution of the gate. 

The version of this scheme that we will use in our analysis is based on the circuit shown in Fig. 9. To realize 
the single-qubit rotation $U_z(\theta) = \exp(-i\frac{\theta}{2}Z)$, an ancilla qubit is prepared in the state $|A_\theta\rangle = U_z(\theta)|+\rangle$. A 
CNOT gate is applied with the data qubit as control and the ancilla qubit as target, and then the ancilla 
qubit is measured in the Z basis. If the outcome of the measurement is $|0\rangle$, then $U_z(\theta)$ has been successfully 
applied to the data. If the measurement outcome is $|1\rangle$, then $U_z(-\theta)$ has been applied instead; in the latter 
case, the implementation of $U_z(\theta)$ can be completed by applying $U_z(2\theta)$. 

Fig. 9. Implementation of the gate $U_z(\theta)$. An ancilla is prepared in the state $|A_\theta\rangle = U_z(\theta)|+\rangle$, 
and a CNOT gate is executed with the data as control and the ancilla as target; then the ancilla 
is measured in the basis $\{|0\rangle, |1\rangle\}$. The gate $U_z(2\theta)$ is applied to the data conditioned on the 
measurement outcome. 
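
The action of this circuit can be checked with a small state-vector calculation. The Python sketch below is illustrative only; the function name `rotate_via_ancilla` and the test state are our own assumptions. It prepares the ancilla state $U_z(\theta)|+\rangle$, applies the CNOT, projects the ancilla onto each measurement outcome, applies the conditional $U_z(2\theta)$, and compares the result with $U_z(\theta)$ acting on the data.

```python
import cmath
import math

def u_z_diag(theta):
    """Diagonal entries of U_z(theta) = exp(-i theta Z / 2)."""
    return (cmath.exp(-1j * theta / 2), cmath.exp(1j * theta / 2))

def rotate_via_ancilla(data, theta, outcome):
    """Run the ancilla-assisted rotation on a normalized data state (a, b).

    Returns the data state for the given ancilla outcome (0 or 1), after the
    conditional U_z(2 theta) correction has been applied.
    """
    p0, p1 = u_z_diag(theta)
    ancilla = (p0 / math.sqrt(2), p1 / math.sqrt(2))   # U_z(theta)|+>
    # Joint amplitudes psi[d][a] for data bit d and ancilla bit a.
    psi = [[data[d] * ancilla[a] for a in range(2)] for d in range(2)]
    # CNOT with the data as control: flip the ancilla bit when d = 1.
    psi[1][0], psi[1][1] = psi[1][1], psi[1][0]
    # Project the ancilla onto the measured outcome and renormalize.
    post = [psi[0][outcome], psi[1][outcome]]
    norm = math.sqrt(sum(abs(x) ** 2 for x in post))
    post = [x / norm for x in post]
    if outcome == 1:
        # U_z(-theta) was applied; complete the gate with U_z(2 theta).
        c0, c1 = u_z_diag(2 * theta)
        post = [c0 * post[0], c1 * post[1]]
    return post

theta = math.pi / 4
data = (0.6, 0.8j)
p0, p1 = u_z_diag(theta)
target = [p0 * data[0], p1 * data[1]]                  # U_z(theta) applied directly
for outcome in (0, 1):
    result = rotate_via_ancilla(data, theta, outcome)
    assert all(abs(r - t) < 1e-12 for r, t in zip(result, target))
```

Both outcomes occur with probability 1/2, and in either case the corrected data state equals $U_z(\theta)$ applied to the input.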



Why is this circuit useful? Suppose that we have fault-tolerant 1-Ga's for the gates H, S, and CNOT 
that generate $C_2$, and that we seek a fault-tolerant level-1 gadget for the $C_3$ gate $T = U_z(\pi/4)$ to complete 
a universal set. For $\theta = \pi/4$, $U_z(2\theta) = S$ is a $C_2$ gate, so in that case, all of the operations in Fig. 9 can be 
implemented fault tolerantly. To simulate the T gate at level 1, we replace the CNOT, the S gate, and the 
Z measurement by the corresponding 1-Recs. What is needed to complete the simulation is an additional 
1-Rec that reliably prepares (off-line) some suitable quantum software, namely the encoded state $|A_{\pi/4}\rangle$. 

To enforce property 4, we must ensure that a single fault in the preparation circuit for $|A_{\pi/4}\rangle$ will not 
cause more than one error in the circuit's output. It is helpful to observe that $|A_\theta\rangle$ is an eigenstate with 
eigenvalue one of the operator $U_z(\theta) X U_z(\theta)^\dagger = U_z(2\theta) \cdot X$, which for $\theta = \pi/4$ is the $C_2$ operator SX. 

Thus the fault-tolerant realization of the T gate reduces to fault-tolerantly measuring the $C_2$ operator 
SX, an operator for which a fault-tolerant 1-Ga can be constructed. This measurement can be accomplished 
using the cat state method. That is, we can measure an encoded operator U by preparing an ancilla in the 
(verified) state $|+\rangle_{\rm rep} = (|0\rangle_{\rm rep} + |1\rangle_{\rm rep})/\sqrt{2}$ of the quantum repetition code, applying U to the data block 
conditioned on the ancilla being in the state $|1\rangle_{\rm rep}$, measuring the ancilla qubits in the X basis, and then 
computing the parity of the outcomes. We used this method previously in Sec. 7.2.1 to perform fault-tolerant 
measurements of encoded $C_1$ operators, and it can also be used to measure encoded $C_2$ operators, once we 
have constructed the fault-tolerant gadgets for the $C_2$ operators. 

A single fault in the measurement circuit causes only one error in the measured block, but it might also 
cause an error in the measurement outcome. Hence, after an error correction, the measurement should be 
repeated. If the same result is obtained twice in a row, the ancilla block can be accepted safely. Finally, we 
have assembled all the elements of a 1-Ga for the T gate that satisfies properties 3 and 4. 

7.4 Fault-tolerant preparation and measurement 

For (destructive) measurements of encoded Pauli operators, properties 3 and 4 can be restated as: 

3. A 1-measurement with no faults applied to an input with one error agrees with an ideal measurement. 

4. A 1-measurement with at most one fault applied to an input with no errors agrees with an ideal 
measurement. 

Although the measurement is allowed to destroy the input block, for general stabilizer codes measurements 
must be repeated to enforce property 4, and therefore nondestructive measurements should be used. As 
discussed in Sec. 7.2.1, fault-tolerant nondestructive measurements of encoded operators can be executed by 
the cat state method; one fault in the measurement circuit causes just one error in the output block. But 
a single such measurement obeys neither property 3 nor property 4. Therefore, we extend the procedure 
— the first measurement is followed by the 1-EC, then a second measurement, another 1-EC, and finally a 
third measurement. The outcome of the 1-measurement is the majority of the results of the three individual 
measurements. 

If there are no faults, then the first 1-EC corrects the error in the input, and the second and third 
measurements both agree with the ideal outcome. If there is one fault, it could occur in one of the measurements 
or in one of the 1-ECs. But no matter where it is located, the fault can disturb the outcome of only one of 
the three measurements. Therefore properties 3 and 4 are satisfied. 
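
The voting step can be illustrated with a few lines of Python. This is a schematic sketch under a simplifying assumption: the whole effect of a single input error or a single fault is modeled as flipping the result of at most one of the three measurement rounds. The function name `one_measurement` is hypothetical.

```python
def one_measurement(ideal_outcome, disturbed=None):
    """Majority vote over three measurement rounds.

    ideal_outcome: the bit an ideal measurement would return.
    disturbed: index (0, 1, or 2) of the single round whose result is flipped,
               or None if no round is disturbed.
    """
    rounds = [ideal_outcome] * 3
    if disturbed is not None:
        rounds[disturbed] ^= 1
    return 1 if sum(rounds) >= 2 else 0

# A single disturbed round never changes the majority.
for ideal in (0, 1):
    for disturbed in (None, 0, 1, 2):
        assert one_measurement(ideal, disturbed) == ideal
```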

For CSS codes, an especially efficient destructive 1-measurement can be constructed, at least for 
measurement of the encoded operations X and Z. For example, to measure the encoded Z, which is a Z-type Pauli operator, all 
qubits are measured in the Z basis, and the results are postprocessed classically. A single erroneous outcome 
can be diagnosed by applying a classical parity check, the error can be corrected, and the eigenvalue of the 
encoded Z then evaluated. The error could be due to an error in the input, or due to a fault in one of the 
measurements — hence properties 3 and 4 are satisfied, if we assume that the classical processing is flawless. 
The same procedure can also be used to measure X. 

Since as far as the quantum processing is concerned the 1-measurement is built from only 0-measurements, 
the recursively defined k-measurement is also very simple. All of the qubits can be measured simultaneously 
in a single time step. Then the classical processing is done recursively. First each 1-block is (classically) 
decoded and a one-bit result recorded, then the process is repeated altogether k times to extract the final 
one-bit outcome of the measurement of the k-block. 
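
The classical post-processing can be sketched in Python for the [[7,1,3]] code. The sketch assumes the standard parity-check matrix of the [7,4,3] Hamming code, ordered so that the syndrome read as a binary number gives the position of a single flipped bit, and it takes the encoded Z eigenvalue to be the parity of the corrected block. The names `decode_block` and `decode_level` are our own.

```python
# Parity checks of the [7,4,3] Hamming code (an assumed, standard ordering).
H_ROWS = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def decode_block(bits):
    """Correct at most one flipped outcome in a 7-bit block, then return the
    encoded Z value (0 or 1) as the parity of the corrected block."""
    bits = list(bits)
    s = [sum(h * b for h, b in zip(row, bits)) % 2 for row in H_ROWS]
    position = 4 * s[0] + 2 * s[1] + s[2]   # 0 means a trivial syndrome
    if position:
        bits[position - 1] ^= 1
    return sum(bits) % 2

def decode_level(bits, k):
    """Decode the 7**k outcomes of a k-measurement to a single bit."""
    if k == 0:
        return bits[0]
    size = 7 ** (k - 1)
    sub_results = [decode_level(bits[i * size:(i + 1) * size], k - 1)
                   for i in range(7)]
    return decode_block(sub_results)

# Example: a level-2 block encoding 1, with one flipped outcome in every 1-block.
ONE, ZERO = [1, 1, 1, 0, 0, 0, 0], [0] * 7
level2 = [b for bit in ONE for b in (ONE if bit else ZERO)]
for i in range(7):
    level2[7 * i] ^= 1
print(decode_level(level2, 2))
```

Each 1-block corrects its own single flipped outcome, so the level-2 result is still 1.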

For a 1-preparation, which has no input, we need only worry about property 4, which can be restated: 

4. A 1-preparation with at most one fault produces an output with at most one error. 

While encoding circuits typically propagate errors badly, we can build a fault-tolerant 1-preparation of the 
encoded $|0\rangle$ for any stabilizer code, using the cat state method for measuring encoded Pauli operators. 
Starting with an arbitrary state (such as a product of $|0\rangle$'s), the 1-EC can be executed to obtain a state in 
the code space if there are no faults (property 0), or a valid state if there is one fault (property 0'). Then the 
encoded Pauli operator Z can be measured nondestructively three times as described above. If a majority 
vote on the results finds the result $|0\rangle$, the preparation is finished, and if the result $|1\rangle$ is found, an encoded 
X is applied to complete the procedure. 

For CSS codes, there are more efficient 1-preparation procedures. The 1-preparation satisfies property 4 
in the CSS sense if its output has no more than one X error and no more than one Z error; we can enforce 
this property by subjecting an encoded block to a verification test, and rejecting the block when multiple 
errors are detected. This procedure is especially simple for a perfect CSS code like the [[7,1,3]] code, since in 
that case, we do not need to worry about more than one Z error in the encoded state $|0\rangle$: for the [[7,1,3]] 
code, two Z errors are equivalent to an encoded Z and a single Z error, and the encoded $|0\rangle$ is an eigenstate 
of the encoded Z with eigenvalue one. We can check for multiple X errors just as in the verification test included in the 
Steane error correction circuit. We encode both the data block and an ancilla block in the state $|0\rangle$, then 
execute a CNOT gate from data to ancilla, and measure the ancilla in the Z basis. Applying classical parity 
checks to the measurement outcomes, we extract the data's X error syndrome, and after a classical error 
correction (if necessary), the eigenvalue of the encoded Z. If the outcome of the encoded Z measurement is $|0\rangle$, then the 
data block is accepted. Otherwise it is rejected. 

However, for a CSS code that is not perfect, we need to check for multiple Z errors as well as multiple X 
errors; this can be achieved with the circuit in Fig. 10, in which three encoded ancilla blocks are used for the 
verification. The first ancilla block detects multiple X errors in the data, as before. A second ancilla block 
is used to check for multiple Z errors. But if there is a fault in the encoding of the second ancilla block, it 
might have multiple X errors that could propagate to the data; to prevent this the second ancilla block is 
also verified, using the third ancilla block. 

Fig. 10. 1-preparation of the encoded state $|0\rangle$, for CSS codes. Three ancilla blocks are used 
to verify that the data block contains no more than one X error and no more than one Z error. 
For the perfect [[7,1,3]] code, there is no need to check for multiple Z errors, and only one ancilla 
block is needed. 



8 Threshold estimate for the [[7,1,3]] code 

Now we will discuss explicit constructions of the level-1 gadgets for a particular quantum error-correcting 
code, Steane's [[7,1,3]] code [16], the distance-3 CSS code with the smallest block size. By counting the 
malignant pairs of locations in the 1-exRecs as outlined in Sec. 6, we will obtain a lower bound on the 
quantum accuracy threshold $\varepsilon_0$, assuming an independent stochastic noise model. In our analysis, we will 
assume that classical processing of measurement outcomes is flawless and instantaneous. 

Our goal is to design gadgets that are as simple as possible, and so to obtain a good threshold estimate. 
We will be less attentive to overhead requirements, and in particular, the overhead of some of the gadgets 
is nondeterministic; encoded states are verified and are discarded if the verification fails. Thus there is a 
small probability that the verification step needs to be repeated many times before the state is accepted. 
We could modify the gadgets so that the overhead is deterministic, but the gadgets then become a bit more 
complicated and the threshold estimate worsens, so we will not discuss such modifications here. 
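
The [[7,1,3]] code has the generator matrix 

$$G = \begin{pmatrix} H_x & 0 \\ 0 & H_z \end{pmatrix} , \qquad (27)$$

where 

$$H_z = H_x = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix} . \qquad (28)$$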

The encoded X and Z can be realized by the weight-3 Pauli operators XXXIIII and ZZZIIII. The 
encoded generators of $C_2$ can all be realized transversally: the encoded H by applying H to each of the seven qubits in 
the block, the encoded S by applying $S^\dagger$ to each qubit, and the encoded CNOT by applying CNOT gates from each qubit 
in the control block to the corresponding qubit in the target block. A universal gate set can be completed 
by adding the encoded T, realized via fault-tolerant preparation of the encoded state $|A_{\pi/4}\rangle$, followed by execution of the circuit 
shown in Fig. 9. 
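
The stated logical operators can be checked against the stabilizer with a short binary calculation. The Python sketch below assumes the standard parity-check matrix of the [7,4,3] Hamming code for $H_x = H_z$ (the row ordering is our assumption), and uses the fact that an X-type and a Z-type Pauli operator commute exactly when their supports overlap on an even number of qubits. The variable names are our own.

```python
from itertools import product

# Assumed parity-check matrix of the [7,4,3] Hamming code (H_x = H_z).
H_ROWS = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
LOGICAL = [1, 1, 1, 0, 0, 0, 0]   # support of XXXIIII and of ZZZIIII

def overlap(u, v):
    """Parity of the number of positions where both supports are 1."""
    return sum(a & b for a, b in zip(u, v)) % 2

def span(rows):
    """All binary linear combinations of the given rows."""
    vectors = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        vectors.add(tuple(sum(c * r[i] for c, r in zip(coeffs, rows)) % 2
                          for i in range(7)))
    return vectors

# X-type and Z-type stabilizer generators commute with each other.
stabilizers_commute = all(overlap(r, s) == 0 for r in H_ROWS for s in H_ROWS)
# XXXIIII commutes with every Z-type generator (and ZZZIIII with every X-type one).
logical_commutes = all(overlap(r, LOGICAL) == 0 for r in H_ROWS)
# XXXIIII and ZZZIIII overlap on three qubits, so they anticommute.
logicals_anticommute = overlap(LOGICAL, LOGICAL) == 1
# The weight-3 operator is not itself a stabilizer element.
logical_nontrivial = tuple(LOGICAL) not in span(H_ROWS)

print(stabilizers_commute, logical_commutes, logicals_anticommute, logical_nontrivial)
```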

8.1 Fault-tolerant gadgets 

We will suppose that our quantum computer is equipped with the following 0-Ga's: preparation of $|0\rangle$ and 
$|+\rangle$, the gates $\{H, T, T^\dagger, \mathrm{CNOT}\}$, and measurement of Z and X. There is some redundancy in this list: we 
could prepare $|+\rangle$ by first preparing $|0\rangle$ and then applying H, and we could measure X by first applying H 
and then measuring Z. But in our threshold analysis, we regard each listed 0-Ga as a primitive location in 
our quantum circuit, where a fault can occur with probability $\varepsilon$. (Note that $T^2 = S$, so the set is universal.) 
In a recursive simulation, we are to use these 0-Ga's to realize the same set of primitive objects as 1-Ga's. 

The 1-preparation and 1-measurement are realized following the CSS constructions described in Sec. 7.4, 
and error correction is performed using the Steane method depicted in Fig. 7. The encoded ancilla is rejected 
whenever a non-trivial syndrome or the incorrect value of the encoded Z or X is obtained in the verification 
step. We will assume that many encoding and verification circuits are executed in parallel, so that a verified 
ancilla is always available when needed. 

As discussed in Sec. 7.2.2, although a single fault can cause both an error in the data block and an error 
in the syndrome measurement, the syndrome measurement need not be repeated to ensure fault tolerance. 
Furthermore, we will not need to include a recovery step in which a weight-1 Pauli operator is applied to correct 
the error. Instead, Pauli errors will be recorded in a classical register, and propagated through subsequent 
$C_2$ gates by an efficient classical computation. No corrections are needed until encoded blocks are measured; 
at that stage, the recorded X errors are consulted to properly decode the outcome of a Z measurement, and 
the recorded Z errors are consulted to properly decode the outcome of an X measurement. This procedure 
works because the $C_3$ gate T is realized through the off-line preparation and verification of quantum software; 
only $C_2$ gates, which propagate Pauli operators to Pauli operators, act directly on the data blocks. 

The level-1 $C_2$ gate with the largest 1-exRec is the CNOT, whose 1-exRec is shown schematically in Fig. 11. 
Contained within each 1-EC, but not shown explicitly in Fig. 11, are encoding circuits for the $|0\rangle$ and $|+\rangle$ 
states; the $|0\rangle$ encoder is shown in Fig. 12. Locations in the circuits where the qubits are "resting" and 
subject to storage faults are indicated by a thickening of the wires. 

To realize the gate T needed to complete our universal set, we prepare an ancilla in the encoded state 
$|A_{\pi/4}\rangle$ using the circuit shown in Fig. 13. This circuit employs the cat state method to twice measure 
the $C_2$ operator $TXT^\dagger = SX$, which in the [[7,1,3]] code can be implemented transversally by applying 
$S^\dagger X = T^\dagger X T$ to each qubit in the block. First an ancilla block is prepared in the encoded state $|0\rangle$, and a 
seven-qubit cat state $(|0\rangle^{\otimes 7} + |1\rangle^{\otimes 7})/\sqrt{2}$ is encoded and verified using the circuit in Fig. 14. Then the transversal 
$T^\dagger X T$ is applied to the ancilla controlled by the cat state, all qubits of the cat state are measured in the 
X basis, and the parity of the measurement outcomes is evaluated classically to determine the outcome of 
the measurement of $TXT^\dagger$. Then the error syndrome is measured, and the ancilla is discarded unless the 
syndrome is trivial (indicates no errors). Finally the whole procedure is repeated: $TXT^\dagger$ and then the 
error syndrome are measured a second time. The ancilla is accepted only if both measurements of $TXT^\dagger$ 
find the same eigenvalue, and if both measured syndromes are trivial. Repeating the measurement ensures 
the fault-tolerance of the ancilla measurement, and rejecting states with nontrivial error syndromes improves 
the fidelity of the preparation. (If the measured eigenvalue of $TXT^\dagger$ is −1, we can flip the eigenvalue by 
applying Z, or we can incorporate this Z into the Pauli error that is recorded and propagated classically.) 

Fig. 11. The CNOT extended rectangle for a CSS code. The encoding circuits that prepare $|0\rangle$ and 
$|+\rangle$ are suppressed here (the $|0\rangle$ encoder is shown explicitly in Fig. 12). Classical parity checks (PC), 
which are assumed to be instantaneous and flawless, are performed on the measurement outcomes 
to diagnose errors in the data blocks. Diagnosed Pauli errors are not explicitly corrected; rather 
they are stored in a classical register and propagated through subsequent $C_2$ gates by an efficient 
classical computation. Locations where storage faults can occur are indicated by thickened wires. 

The encoding of $|0\rangle$ is included in the $|A_{\pi/4}\rangle$ preparation 1-exRec to ensure that, if there are no faults 
during encoding, the input to the encoded measurement is in the code space. It is not necessary to verify 
the encoded $|0\rangle$, since any other state in the code space would serve as well. A fault during encoding may 
propagate badly, but if there are no other faults, the resulting deviation from the code space will be detected 
subsequently and the prepared state will be rejected. 

8.2 Counting of malignant pairs: Procedure 

In Sec. 6, we described how to estimate the quantum accuracy threshold by counting the number of malignant 
pairs of locations within an extended rectangle. This counting is entirely combinatorial, and we have used a 
computer program written in Matlab to carry it out exhaustively. 

Two noise models were analyzed. The first is (adversarial) independent stochastic noise, where fault 
locations are independently and identically distributed, and once the fault locations are chosen the operations 
at those locations are arbitrary trace-preserving completely positive maps. Since all faults have a Pauli 
expansion, this model can be analyzed by testing Pauli faults at a pair of locations, and declaring the pair 
of locations to be benign if and only if the 1-Rec contained in the 1-exRec is correct for all possible Pauli 
faults at those locations. For this model, we can apply Theorem 2 to derive a rigorous lower bound on the 
quantum accuracy threshold $\varepsilon_0$. 

Fig. 12. The level-1 $|0\rangle$ encoding circuit for the [[7,1,3]] code. The $|+\rangle$ encoder is identical, except 
all CNOT gates are reversed in direction and the input states $|0\rangle$ and $|+\rangle$ are interchanged. Vertical 
lines separate successive time steps, and locations where storage faults can occur are indicated by 
thickened wires. (There is no storage fault in the first time step because one 0-preparation occurs 
a step behind the others.) 

The second noise model inserts the depolarizing channel at each location. Again the fault locations are 
independently and identically distributed, but now the fault at a bad location is assumed to be a Pauli 
operator, chosen equiprobably from among {X, Y, Z} for single-qubit gates, and equiprobably among the 
15 nontrivial Pauli operators for two-qubit gates. For this model, we can estimate a critical noise rate 
such that, for noise rates below it, a level-1 simulation outperforms an unprotected level-0 simulation. This 
estimate allows us to make a heuristic comparison between the adversarial and depolarizing noise models. 
But because the effective noise model that governs whether a k-Rec is bad for k > 1 is not exactly self-similar, 
this critical noise rate cannot be identified as a rigorously established accuracy threshold. 
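
As an illustration of the depolarizing model, the following Python sketch samples the fault at a single location. It is a minimal sketch and not the program used for the analysis; the function `sample_fault` and the string encoding of Pauli operators are our own assumptions.

```python
import itertools
import random

SINGLE_QUBIT_FAULTS = ["X", "Y", "Z"]
# All two-qubit Pauli strings except the identity "II": 15 operators.
TWO_QUBIT_FAULTS = [a + b for a, b in itertools.product("IXYZ", repeat=2)
                    if a + b != "II"]

def sample_fault(num_qubits, eps, rng):
    """Return None for a good location, else a Pauli string drawn uniformly
    from the nontrivial Pauli operators on the location's qubits."""
    if rng.random() >= eps:
        return None
    faults = SINGLE_QUBIT_FAULTS if num_qubits == 1 else TWO_QUBIT_FAULTS
    return rng.choice(faults)

rng = random.Random(0)
samples = [sample_fault(2, 0.1, rng) for _ in range(10_000)]
bad_fraction = sum(s is not None for s in samples) / len(samples)
print(len(TWO_QUBIT_FAULTS), bad_fraction)
```

Each location is sampled independently with the same rate, matching the assumption that fault locations are independently and identically distributed.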

In both models, faults at a bad location are inserted right after the ideal implementation of the 0-Ga. 
For a faulty 0-preparation, an X or Z is inserted right after the ideal preparation of $|0\rangle$ or $|+\rangle$, and for a 
faulty 0-measurement, an X or Z is inserted right before an ideal measurement of Z or X. 

The investigation of whether a specified pair of locations is benign is conducted differently for the CNOT 
1-exRec, which contains only $C_2$ 0-Ga's, than for the $|A_{\pi/4}\rangle$ 1-exRec, which contains the $C_3$ gates T and 
$T^\dagger$. Since $C_2$ gates propagate Pauli operators to Pauli operators, for the CNOT 1-exRec, input Pauli errors 
and Pauli errors caused by faults can be propagated through the circuit without ever leaving the Pauli 
group. When an ancilla qubit with an X error (say) is measured in the Z basis, a nontrivial syndrome bit is 
recorded. Finally, the Pauli error acting on the output of the 1-Rec is compared with the Pauli error acting 
on the input to test for encoded errors. 
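
The propagation of Pauli errors through CNOT gates can be sketched with the usual binary representation, in which an n-qubit Pauli operator (up to phase) is stored as a list of X bits and a list of Z bits. The Python code below is a minimal illustration under that assumption; the function names are our own and this is not the program used for the counting.

```python
def apply_cnot(x, z, control, target):
    """Propagate a Pauli error, given as X and Z bit lists, through an ideal CNOT."""
    x, z = list(x), list(z)
    x[target] ^= x[control]   # an X error on the control spreads to the target
    z[control] ^= z[target]   # a Z error on the target spreads to the control
    return x, z

def transversal_cnot(x, z, n=7):
    """CNOT from qubit i of the control block to qubit i of the target block."""
    for i in range(n):
        x, z = apply_cnot(x, z, i, n + i)
    return x, z

# Qubits 0-6 form the control block and qubits 7-13 the target block.
x = [0] * 14
z = [0] * 14
x[2] = 1        # X error at position 2 of the control block
z[7 + 5] = 1    # Z error at position 5 of the target block
x_out, z_out = transversal_cnot(x, z)
print([i for i, bit in enumerate(x_out) if bit],
      [i for i, bit in enumerate(z_out) if bit])
```

Each error spreads only to the corresponding position of the other block, which is the property of transversal gates used in Sec. 7.3.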

The CNOT 1-Rec is certainly correct if both faults in the 1-exRec are contained in one of the leading 
1-ECs; therefore we may assume that each leading 1-EC has no more than one fault. This observation leads 
to a further simplification. We have already seen in Sec. 7.2.4 that a 1-EC with one fault takes an input with 
one error to a valid output, deviating from the code space due to an error at the position in the code block 
where the fault acted. Therefore, to investigate the consequences of a fault at a single location in one of the 
leading 1-ECs, we may disregard the input error while allowing an arbitrary fault at the specified location. 

We say that the $|A_{\pi/4}\rangle$ preparation 1-exRec succeeds if its output has no more than one error relative 
to the eigenstate found in the measurement of $TXT^\dagger$; otherwise it fails. A pair of locations is benign 
if the 1-exRec succeeds for arbitrary faults at those locations. To investigate whether a specified pair of 
locations is benign we consider arbitrary Pauli faults at the specified locations, propagate the errors to the 
output, and check whether the 1-exRec succeeds. However, an X error propagated through a T gate becomes 
$TXT^\dagger = (X + Y)/\sqrt{2} = X \cdot (I + iZ)/\sqrt{2}$, which is not a Pauli operator. In our analysis, we pessimistically 
assume that when an X error propagates through a T or $T^\dagger$ gate, the result is an X error accompanied by 
a potential Z error that can be turned on or off adversarially. 

Fig. 13. Preparation of an ancilla in the encoded state $|A_{\pi/4}\rangle$, which is needed for the execution 
of the encoded $C_3$ gate T. The ancilla is prepared in the state $|0\rangle$, then a seven-qubit cat state 
is prepared and verified, and the cat state is used to measure $TXT^\dagger$. The error syndrome is 
measured and then the procedure is repeated. The ancilla is accepted if both measurements of 
$TXT^\dagger$ yield the same eigenvalue and both error syndromes are trivial. In this circuit T denotes, 
not the 1-Ga (the encoded T), but rather the 0-Ga T applied transversally to each qubit in the code block. The 
preparation and verification of the cat states, suppressed here, is shown in Fig. 14. 

A pair of locations in the $|A_{\pi/4}\rangle$ 1-exRec is benign if both locations are in the first half of the circuit (the 
first $TXT^\dagger$ measurement and the first 1-EC). In that case the second half of the circuit has no faults, so 
the state is a codeword if the second 1-EC detects no error, and the eigenvalue found by the second $TXT^\dagger$ 
measurement is correct. A pair of locations is also benign if exactly one of the locations is in the second 
1-EC. In that case, since there is only one fault at a location preceding the second 1-EC, the input to the 
second 1-EC has at most one error, and therefore, if no error is detected by the 1-EC, its output also has 
at most one error; furthermore, at least one of the two $TXT^\dagger$ measurements must be correct. By similar 
reasoning, we see that no T or $T^\dagger$ gates belong to any malignant pair. A fault in one of these gates can 
cause a single error, but the error will be detected if the following 1-EC has no faults. On the other hand, 
if the following 1-EC does have a fault, then, if no error is detected, its output has at most one error. 

It may be that both fault locations are in the final 1-EC; in that case the Pauli errors can be propagated 
through the circuit and the output tested for an encoded error. All other malignant pairs contain at least 
one location in the encoding or measurement of a cat state. 

8.3 Counting of malignant pairs: Results 

In general, distinct 0-Ga's are expected to have distinct fault rates. Let us designate the types of 0-Ga's by 
the labels 1-8, as follows: 

1. rest during a gate cycle. 

2. rest during a measurement cycle. 

3. preparation of $|0\rangle$. 

4. preparation of $|+\rangle$. 

5. measurement of X. 

6. measurement of Z. 

7. CNOT gate. 

8. T or $T^\dagger$ gate. 

We denote the probability of a fault at a location of type i by $\varepsilon_i$. Since the time needed to execute a gate 
might differ from the time needed for a measurement, we allow the rate $\varepsilon_1$ of storage faults during gate 
cycles to differ from the rate $\varepsilon_2$ of storage faults during measurement cycles. (We will not explicitly discuss 
the H 0-Ga, which appears in the transversal H 1-Ga, but not in any other gadgets.) 

Fig. 14. Encoding and verification of the seven-qubit cat state, needed for the preparation of 
the encoded ancilla state $|A_{\pi/4}\rangle$. The state is rejected if the outcome of the measurement of the 
verifier qubit is $|1\rangle$, indicating multiple X errors that could have resulted from a single fault in 
the encoder. 

The 1-EC gadget contains 142 locations, which can be enumerated as follows: There are four encoders 
(18 locations each), four 1-measurements (7 locations each), two rests (7 locations each) and four encoded 
CNOT gates (7 locations each). Thus the CNOT 1-exRec contains 575 locations: four 1-ECs (142 locations 
each) and one CNOT (7 locations). This is the largest 1-exRec used in the level-1 simulation. 
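
The location counts quoted above can be reproduced with a few lines of arithmetic. The Python snippet below is only a bookkeeping check; the variable names are our own.

```python
ENCODER_LOCATIONS = 18   # locations in one encoder
BLOCK_SIZE = 7           # one location per qubit in a transversal step

# 1-EC: four encoders, four 1-measurements, two rests, four encoded CNOT gates.
one_ec = (4 * ENCODER_LOCATIONS + 4 * BLOCK_SIZE
          + 2 * BLOCK_SIZE + 4 * BLOCK_SIZE)

# CNOT 1-exRec: four 1-ECs plus the transversal CNOT itself.
cnot_exrec = 4 * one_ec + BLOCK_SIZE

print(one_ec, cnot_exrec)   # 142 575
```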

To use the method of Sec. 6 to estimate the threshold, we need to be careful to take into account that 
ancilla blocks are used for syndrome extraction only after successfully passing a verification test. Therefore, 
in a recursive estimate of the failure probability for a k-exRec, we should upper bound the failure probability 
conditioned on the acceptance of all ancilla blocks used in the k-exRec. Furthermore, this bound should be 
expressed in terms of failure probabilities for the (k-1)-exRecs, also conditioned on the acceptance of all 
ancilla blocks used in the (k-1)-exRecs. In our enumeration of malignant pairs, it is convenient to count 
the pairs of locations such that suitable faults at those locations lead to the acceptance of all ancilla blocks, 
and cause failure of the k-exRec. Thus we obtain an upper bound on the joint probability of acceptance of 
all ancillas and failure of the k-exRec. After an appropriate adjustment, we can then find an upper bound 
on the conditional probability of failure, given acceptance of all ancillas. 

Let $\alpha_{ij}$ denote the number of pairs of locations of types i and j where faults can cause the CNOT 1-exRec 
to fail, while all ancilla blocks contained in the 1-exRec are accepted. Then, arguing as in Sec. 6, we find that 
the joint probability $\varepsilon^{(k)}_{7,\rm joint}$ of failure for the CNOT k-exRec and acceptance of all ancillas can be bounded 
in terms of the failure rates $\{\varepsilon_j^{(k-1)}\}$ for (k-1)-exRecs (conditioned on acceptance of all ancillas contained 
in these (k-1)-exRecs) as 

$$\varepsilon^{(k)}_{7,\rm joint} \le \sum_{j \le i = 1}^{7} \alpha_{ij}\, \varepsilon_i^{(k-1)} \varepsilon_j^{(k-1)} + B_{\rm CNOT} \left(\varepsilon_{\max}^{(k-1)}\right)^3 , \qquad (29)$$

where $\varepsilon_{\max}^{(k-1)}$ is the largest of the (conditional) failure rates in $\{\varepsilon_i^{(k-1)}\}$ and 

$$B_{\rm CNOT} = \binom{575}{3} = 31{,}519{,}775 \qquad (30)$$

is the number of ways to choose three locations in the CNOT 1-exRec. We have computed $\alpha_{ij}$ using the 
procedure outlined in Sec. 8.2. Represented as a lower triangular 7×7 matrix (the T and $T^\dagger$ gates do not 
appear in the 1-exRec), the result is: 
(31)



To treat the case where the fault probability is $\varepsilon$ at every location, we add all entries in $\alpha_{ij}$ with equal
weight, obtaining that the joint probability of failure and acceptance of all ancillas can be bounded as

$$\varepsilon^{(k)}_{7,\mathrm{joint}} \le A_{\mathrm{CNOT}} \left(\varepsilon^{(k-1)}\right)^2 + B_{\mathrm{CNOT}} \left(\varepsilon^{(k-1)}\right)^3 \le A'_{\mathrm{CNOT}} \left(\varepsilon^{(k-1)}\right)^2 , \tag{32}$$

where $A_{\mathrm{CNOT}} = 35{,}235$ malignant pairs, $\varepsilon^{(k-1)} \le (A'_{\mathrm{CNOT}})^{-1}$, and $A'_{\mathrm{CNOT}} \approx 36{,}108$, as in eq. (19). Now, to
obtain an upper bound on the probability of failure $\varepsilon_7^{(k)}$ for the CNOT k-exRec, conditioned on the acceptance
of all ancillas, we can use Bayes' rule. Let $P^{(k)}_{|0\rangle,\mathrm{accept}}$ denote the probability that a level-k encoded $|0\rangle$ or
$|+\rangle$ block passes the verification test. Then, since eight ancilla blocks, prepared independently, are used in
the four k-ECs contained in the CNOT k-exRec, we have

$$\varepsilon_7^{(k)} = \left(P^{(k)}_{|0\rangle,\mathrm{accept}}\right)^{-8} \cdot \varepsilon^{(k)}_{7,\mathrm{joint}} . \tag{33}$$

To obtain a lower bound on $P^{(k)}_{|0\rangle,\mathrm{accept}}$, and hence an upper bound on $\varepsilon_7^{(k)}$, we observe that for the ancilla
to be rejected the encoding and verification circuit must contain at least one bad (k-1)-exRec. This circuit
contains $C = 50$ locations (18 for each of two encoders, 7 for the CNOT, and 7 for the measurement of the
verifier block); therefore rejection occurs with probability no larger than $C \varepsilon^{(k-1)}$, and
where $A''_{\mathrm{CNOT}} \approx (1.0111)\, A'_{\mathrm{CNOT}} \approx 36{,}511$, and $\varepsilon_0 = (A''_{\mathrm{CNOT}})^{-1}$. Thus we obtain a rigorous lower bound
on the accuracy threshold (assuming that the CNOT exRec dominates the threshold estimate):

$$\varepsilon_0 > 2.739 \times 10^{-5} . \tag{36}$$

If we assume that storage faults are negligible, we do not need to count the rests in the circuit, reducing
the total number of locations in the CNOT 1-exRec to 487, so that $B_{\mathrm{CNOT}} = 19{,}131{,}795$; furthermore, we
can set $\varepsilon_1 = \varepsilon_2 = 0$, eliminating the entries in the first two rows and columns of $\alpha$, and the number of
locations in the ancilla encoding/verification circuit is reduced to $C = 46$. We then find $A_{\mathrm{CNOT}} = 22{,}701$,
$A'_{\mathrm{CNOT}} \approx 23{,}515$, $A''_{\mathrm{CNOT}} \approx 23{,}887$, and $\varepsilon_0 \approx 4.186 \times 10^{-5}$.
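The arithmetic behind these numbers can be retraced with a short script. It is only a sketch under two assumptions that are not spelled out in the passage above: that $A'$ is the positive root of $A'^2 = A\,A' + B$ (the smallest value for which $A\varepsilon^2 + B\varepsilon^3 \le A'\varepsilon^2$ whenever $\varepsilon \le 1/A'$), and that the ancilla-acceptance correction is $(1 - C/A')^{-8}$, obtained from eq. (33) with the rejection bound evaluated at $\varepsilon^{(k-1)} = 1/A'$. The function names are illustrative.

```python
from math import comb, sqrt


def a_prime(a, b):
    """Assumed reading of eq. (19): positive root of A'^2 = A*A' + B.

    At this value A + B/A' = A', so A*e^2 + B*e^3 <= A'*e^2 for all e <= 1/A'.
    """
    return (a + sqrt(a * a + 4 * b)) / 2


def a_double_prime(a1, c, n_ancillas=8):
    """Assumed acceptance correction: each of the n_ancillas ancilla blocks is
    accepted with probability at least 1 - C/A' when e <= 1/A'."""
    return a1 / (1 - c / a1) ** n_ancillas


def threshold(n_locations, malignant_pairs, c):
    """Threshold estimate 1/A'' for an exRec with the given counts."""
    b = comb(n_locations, 3)  # ways to choose three locations
    return 1 / a_double_prime(a_prime(malignant_pairs, b), c)


if __name__ == "__main__":
    # Counts quoted in the text: with storage faults, then without.
    for n, pairs, c in [(575, 35235, 50), (487, 22701, 46)]:
        print(n, pairs, c, threshold(n, pairs, c))
```

Under these assumptions the script reproduces the quoted values of $A'_{\mathrm{CNOT}}$, and the values of $A''_{\mathrm{CNOT}}$ and $\varepsilon_0$ to within about one unit in the last quoted digit.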

The matrix $\alpha$ is informative about how fault locations inside the CNOT exRec combine to lead to a
failure. For example, the malignant pairs of CNOT faults (counted in $\alpha_{77}$) are 37% of the total, and the
malignant pairs containing at least one CNOT gate (the last row of $\alpha$) are 85% of the total. In contrast,
storage faults belong to only 36% of the total number of malignant pairs, reflecting the highly parallelized
nature of Steane's error correction method.

We need to check that the CNOT 1-exRec really dominates the accuracy threshold. Its strongest
competitor is the $|A_{\pi/4}\rangle$ preparation 1-exRec shown in Fig. 13. To count the locations in this exRec, we must
keep in mind that at level 1 and above the $T$ gate is executed using the circuit in Fig. 9, which should be
regarded as four locations (the $|A_{\pi/4}\rangle$ preparation, the CNOT, the measurement, and (if necessary) the $S$
gate). Therefore the $|A_{\pi/4}\rangle$ preparation exRec has 521 locations: there are two 1-ECs (142 locations each),
two circuits for cat state encoding and verification (36 locations each), two $T$ gates and two $T^\dagger$ gates (28
locations each), two CNOT gates from cat state to data block (7 locations each), two cat state measurements
(7 locations each), a $|0\rangle$ encoding circuit (18 locations), and a rest at the end of the 1-exRec during the final
measurement that determines whether the ancilla is accepted (7 locations).
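The two location counts can be tallied mechanically. The sketch below copies the component sizes from the text; the component labels are descriptive only, and the label for the second group of 28-location gates ($T^\dagger$) is an assumed reading of the source.

```python
# Each entry is (component, how many copies, locations per copy).
CNOT_EXREC = [
    ("1-EC", 4, 142),
    ("transversal CNOT", 1, 7),
]

A_PREP_EXREC = [
    ("1-EC", 2, 142),
    ("cat state encoding and verification", 2, 36),
    ("T gate", 2, 28),
    ("T-dagger gate (assumed reading)", 2, 28),
    ("CNOT from cat state to data block", 2, 7),
    ("cat state measurement", 2, 7),
    ("|0> encoding circuit", 1, 18),
    ("rest during the final measurement", 1, 7),
]


def total_locations(components):
    """Sum copies * size over all components of an exRec."""
    return sum(copies * size for _, copies, size in components)


if __name__ == "__main__":
    print(total_locations(CNOT_EXREC), total_locations(A_PREP_EXREC))
```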

The $|A_{\pi/4}\rangle$ preparation exRec uses two cat states and four encoded ancillas, all of which must pass
verification tests, and furthermore the output of the exRec is accepted only if the same measured eigenvalue
is obtained twice in a row. As in our analysis of the CNOT exRec, the counting of malignant pairs in the
$|A_{\pi/4}\rangle$ preparation exRec provides an upper bound on the joint probability of failure of the exRec and a
successful outcome in all verification tests. Let $\beta_{ij}$ denote the number of pairs of locations of type $i$ and
$j$ where faults can cause the $|A_{\pi/4}\rangle$ preparation 1-exRec to fail, while all verification tests succeed. (This
matrix is $7 \times 7$, because it turns out that the $T$ and $T^\dagger$ locations do not belong to any malignant pair.)
Then, the joint probability $\varepsilon^{(k)}_{|A_{\pi/4}\rangle,\mathrm{joint}}$ of failure for the $|A_{\pi/4}\rangle$ k-exRec and success in all verification tests can
be bounded in terms of the failure rates $\{\varepsilon_i^{(k-1)}\}$ for the (k-1)-exRecs as

$$\varepsilon^{(k)}_{|A_{\pi/4}\rangle,\mathrm{joint}} \le \sum_{j \le i = 1}^{7} \beta_{ij}\, \varepsilon_i^{(k-1)} \varepsilon_j^{(k-1)} + B_{|A_{\pi/4}\rangle} \left(\varepsilon_{\max}^{(k-1)}\right)^3 , \tag{37}$$

where $\varepsilon_{\max}^{(k-1)}$ is the largest of the failure rates in $\{\varepsilon_i^{(k-1)}\}$, and

$$B_{|A_{\pi/4}\rangle} = \binom{521}{3} = 23{,}434{,}580 \tag{38}$$

is the number of ways to choose three locations in this 1-exRec. Represented as a lower triangular $7 \times 7$
matrix, the result of our computation of $\beta_{ij}$ is:
(39)



To treat the case where the fault probability is $\varepsilon$ at every location, we add all entries of $\beta_{ij}$ with equal
weight, finding $A_{|A_{\pi/4}\rangle} = 2{,}330$ and $A'_{|A_{\pi/4}\rangle} \approx 6{,}144$. In order for any verification to fail, there must be at
least one bad (k-1)-exRec in the k-exRec. Therefore, all verifications succeed with probability

$$P_{\mathrm{succeed}} \ge 1 - D\, \varepsilon^{(k-1)} , \tag{40}$$

where $D = 521$ is the number of locations in the exRec, and we have

$$\varepsilon^{(k)}_{|A_{\pi/4}\rangle} \le A''_{|A_{\pi/4}\rangle} \left(\varepsilon^{(k-1)}\right)^2 , \tag{41}$$

where $A''_{|A_{\pi/4}\rangle} = (1.0927)\, A'_{|A_{\pi/4}\rangle} \approx 6{,}713$. Since $A''_{|A_{\pi/4}\rangle} < A''_{\mathrm{CNOT}} \approx 36{,}511$, the CNOT 1-exRec does indeed
determine our lower bound on the quantum accuracy threshold. Therefore, we have proved:
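As a rough consistency check on the numbers quoted for the $|A_{\pi/4}\rangle$ preparation exRec, the following worked arithmetic assumes (this is inferred from the quoted values, not stated in the surviving text) that $A'$ is the positive root of $A'^2 = A\,A' + B$ and that the factor 1.0927 comes from dividing by the success probability bound $1 - D/A'$, that is, eq. (40) evaluated at $\varepsilon^{(k-1)} = 1/A'$:

```latex
\begin{align*}
A' &= \tfrac{1}{2}\left(A + \sqrt{A^2 + 4B}\right)
    = \tfrac{1}{2}\left(2330 + \sqrt{2330^2 + 4 \cdot 23{,}434{,}580}\right) \approx 6144, \\
\left(1 - \frac{D}{A'}\right)^{-1} &= \left(1 - \frac{521}{6144}\right)^{-1} \approx 1.0927, \\
A'' &\approx 1.0927 \times 6144 \approx 6713 .
\end{align*}
```

With these assumptions the quoted values 6,144, 1.0927, and 6,713 are mutually consistent.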



Theorem 3. Lower bound on the quantum accuracy threshold. Suppose that independent stochastic
faults occur with probability $\varepsilon$ at each location in a noisy quantum circuit. Then for any fixed $\delta$, any ideal
circuit with $L$ locations can be simulated with error $\delta$ or better by a noisy circuit with $L^* = L\,(\mathrm{polylog}\, L)$
locations, provided that

$$\varepsilon < \varepsilon_0 = 2.73 \times 10^{-5} . \tag{42}$$



We note that if we merely counted the number of pairs of locations in the CNOT 1-exRec and used Theorem
1, we would estimate $\varepsilon_0 = \binom{575}{2}^{-1} \approx 6.06 \times 10^{-6}$ (times a small correction arising from the acceptance
probability of the ancillas). By counting malignant pairs and using Theorem 2 instead, we have improved
the lower bound on the accuracy threshold by a factor of about 4.5.
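The comparison can be spelled out in one line of arithmetic, using only the numbers quoted above (the small acceptance correction to the naive estimate is ignored here):

```latex
\begin{align*}
\binom{575}{2} &= \frac{575 \cdot 574}{2} = 165{,}025,
\qquad \binom{575}{2}^{-1} \approx 6.06 \times 10^{-6}, \\
\frac{2.739 \times 10^{-5}}{6.06 \times 10^{-6}} &\approx 4.5 .
\end{align*}
```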

We have also analyzed the performance of our level-1 gadgets against depolarizing noise. For a specified
malignant pair of fault locations, there are some pairs of Pauli errors that do not cause failure. In the
analysis, we weight each of the three Pauli errors at a single-qubit location of type $i$ by $\varepsilon_i/3$, and we weight
each of the fifteen Pauli errors at a CNOT location by $\varepsilon_7/15$. Preparation and measurement errors occur with
probabilities, |£3, |e4, ^Eq. We then find that the joint probability that the cnot 1-exRec fails and all 
ancillas are accepted is 
Again, we can treat the case where the fault probability is $\varepsilon$ at every location by adding all of the matrix entries
with equal weight, obtaining $A_{\mathrm{CNOT}} \approx 7{,}183$ malignant pairs, $A'_{\mathrm{CNOT}} \approx 10{,}256$, and $A''_{\mathrm{CNOT}} \approx 10{,}665$. The
value $A''_{|A_{\pi/4}\rangle} \approx 6{,}713$ for the $|A_{\pi/4}\rangle$ preparation with adversarial noise is already smaller than our value of
$A''_{\mathrm{CNOT}}$ for depolarizing noise, so there is no need to reanalyze the $|A_{\pi/4}\rangle$ preparation for depolarizing noise.
Thus we find that $\varepsilon^{(1)} < \varepsilon$ for $\varepsilon < (A''_{\mathrm{CNOT}})^{-1} = \varepsilon^{(1)}_{0,\mathrm{depol}}$, or

$$\varepsilon^{(1)}_{0,\mathrm{depol}} \approx 9.376 \times 10^{-5} , \tag{45}$$

an improvement over our estimate of $\varepsilon_0$ by about a factor of 3.4. We emphasize, though, that in contrast to
our calculation of $\varepsilon_0$ for adversarial independent stochastic noise reported in Theorem 3, this calculation of
$\varepsilon^{(1)}_{0,\mathrm{depol}}$ is not a rigorous lower bound on the accuracy threshold for depolarizing noise. Rather, $\varepsilon^{(1)}_{0,\mathrm{depol}}$ is a
lower bound on the critical fault rate for which the level-1 simulation is more reliable than the unprotected
level-0 circuit, and at best a rough indication of how effectively a recursive simulation protects against
depolarizing noise compared to adversarial noise. (If we assume that storage faults are negligible, so that
$\varepsilon_1 = \varepsilon_2 = 0$ and $C = 46$, then $A_{\mathrm{CNOT}} \approx 3{,}880$, $A'_{\mathrm{CNOT}} \approx 6{,}725$, $A''_{\mathrm{CNOT}} \approx 7{,}105$, and $\varepsilon^{(1)}_{0,\mathrm{depol}} \approx 1.407 \times 10^{-4}$.)
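The uniform weighting of Pauli errors used in the depolarizing analysis (three errors at a single-qubit location, fifteen at a CNOT location) can be illustrated by enumeration. This is a minimal sketch with hypothetical helper names; it covers only the gate locations, not the preparation and measurement weights.

```python
from itertools import product

PAULIS = "IXYZ"


def nontrivial_paulis(n_qubits):
    """All n-qubit Pauli strings except the identity string."""
    return [
        "".join(p)
        for p in product(PAULIS, repeat=n_qubits)
        if set(p) != {"I"}
    ]


def depolarizing_weights(n_qubits, eps):
    """Spread the total fault probability eps uniformly over the nontrivial Paulis."""
    errors = nontrivial_paulis(n_qubits)
    return {error: eps / len(errors) for error in errors}


if __name__ == "__main__":
    print(len(nontrivial_paulis(1)), len(nontrivial_paulis(2)))
```

A single-qubit location thus gets weight eps/3 per Pauli error and a two-qubit location gets eps/15, matching the weights stated in the text.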

Higher estimates of the accuracy threshold have been reported based on numerical and heuristic studies 
of fault-tolerant gadgets for the [[7,1,3]] code [35, 12, 10, 14, 13]. We emphasize again that our lower bound 
on the quantum accuracy threshold (Theorem 3), in contrast to those previous estimates, has been rigorously 
proven. 

9 Higher-distance codes 

As we have formulated it, the quantum threshold theorem is premised on the existence of a universal set
of level-1 gadgets that satisfy the property exRec-Cor. Our proof shows that exRec-Cor is also satisfied at
higher levels, and that bad k-exRecs are very rare.

In Sec. 2 and 3, we stated conditions on the level-1 gadgets that suffice to ensure that exRec-Cor is
satisfied, and those conditions guided our explicit gadget constructions in Sec. 7 and 8. Now we will revisit
these conditions and state them in a different language.

Our motivation is two-fold. First, the original statement of the conditions employs the somewhat vague 
notion that an input or output block has "at most one error." But as we have emphasized previously, 
since the qubits comprising an encoded block are highly entangled, an error damages not the local state of 
the qubit but rather its correlations with other qubits. Therefore, it is preferable to characterize gadgets 
syntactically rather than semantically — that is, to speak of the properties of operators rather than the 
properties of states. Our restated criteria are syntactic. Secondly, it will be especially easy to apply these 
criteria to quantum codes that can correct multiple errors in a block. 

For a code that can correct $t$ errors, we will say that a 1-exRec is good if it contains no more than $t$
faults. As before, a 1-Rec is correct if the 1-Rec followed by the ideal 1-decoder is equivalent to the ideal
1-decoder followed by the corresponding ideal 0-Ga. At level 1, exRec-Cor says that the 1-Rec contained in
a good 1-exRec is correct; that is, for a good 1-exRec:

    [1-EC] -> [1-Ga] -> [1-EC] -> [ideal 1-decoder]  =  [1-EC] -> [ideal 1-decoder] -> [ideal 0-Ga]



We will state some (syntactic) properties of 1-gadgets, and show that these properties imply exRec-Cor
at level 1. These properties do invoke the condition that the input or output of a gadget is not too badly
damaged, but in a form that can be stated syntactically. Namely, it is enough to characterize how a state
deviates from the code space, rather than its deviation from an ideal state in the code space. For this
purpose, we introduce the concept of a filter. An $s$-filter is the orthogonal projection onto the space spanned
by all states that can be obtained by acting on a codeword with a Pauli operator of weight no larger than
$s$. We will say that a gadget is "r-good" if it contains no more than $r$ faults.
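For a stabilizer code, a Pauli operator maps the code space onto the subspace labeled by its error syndrome, so the $s$-filter can be described as the projector onto the syndrome subspaces reachable by Pauli operators of weight at most $s$. The sketch below enumerates those syndromes for the [[7,1,3]] code, assuming the standard construction from the [7,4,3] Hamming parity-check matrix; the helper names are illustrative, and the example is meant only to make the definition concrete.

```python
from itertools import combinations, product

# Parity-check matrix of the [7,4,3] Hamming code; column i is i+1 in binary.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
N = 7


def syndrome(bits):
    """Syndrome H . bits (mod 2) of a 7-bit error pattern."""
    return tuple(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)


def filter_syndromes(s):
    """Pairs (syndrome of X part, syndrome of Z part) for Paulis of weight <= s."""
    reachable = set()
    for weight in range(s + 1):
        for support in combinations(range(N), weight):
            # On each qubit in the support choose X, Z, or Y (Y has both parts).
            for kinds in product(("X", "Z", "Y"), repeat=weight):
                x = [0] * N
                z = [0] * N
                for qubit, kind in zip(support, kinds):
                    x[qubit] = int(kind in "XY")
                    z[qubit] = int(kind in "ZY")
                reachable.add((syndrome(x), syndrome(z)))
    return reachable


if __name__ == "__main__":
    for s in range(3):
        print(s, len(filter_syndromes(s)))
```

In this example the 0-filter keeps only the trivial syndrome (the code space), the 1-filter keeps 22 of the 64 syndrome subspaces, and the 2-filter keeps all of them, which reflects the fact that a distance-3 code can only make use of filters with $s \le 1$.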

The properties satisfied by the 1-gadgets are more precise formulations of the properties that we stated
earlier for distance-3 codes ($t = 1$), and they also generalize those properties to larger values of $t$. These
properties are as follows.

Property 0:

    [r-good 1-EC]  =  [r-good 1-EC] -> [r-filter]        ($r \le t$)

Property 1:

    [s-filter] -> [r-good 1-EC] -> [ideal 1-decoder]  =  [s-filter] -> [ideal 1-decoder]        ($r + s \le t$)

Property 2:

    [{s_i}-filters] -> [r-good 1-Ga]  =  [{s_i}-filters] -> [r-good 1-Ga] -> [s-filter]        ($s = r + \sum_i s_i \le t$)

Property 3:

    [{s_i}-filters] -> [r-good 1-Ga] -> [ideal 1-decoder]  =  [{s_i}-filters] -> [ideal 1-decoder] -> [ideal 0-Ga]        ($r + \sum_i s_i \le t$)



For properties 2 and 3, in the case where the 1-Ga has more than one input block, by $\{s_i\}$-filters we mean
that the $s_i$-filter is applied to the $i$th input block, while the $s$-filter on the right means that each output
block deviates from the code space by at most distance $s$.

There are also special versions of the properties that hold for the 1-preparation and the 1-measurement.
For the 1-preparation, property 2 becomes:

    [r-good 1-prep.]  =  [r-good 1-prep.] -> [r-filter]

and property 3 is:

    [r-good 1-prep.] -> [ideal 1-decoder]  =  [ideal 0-prep.]        ($r \le t$)

For the 1-measurement, which has only a classical output, there is no analog of property 2, but property 3
becomes:

    [s-filter] -> [r-good 1-meas.]  =  [s-filter] -> [ideal 1-decoder] -> [ideal 0-meas.]        ($r + s \le t$)



(Here we assume that the classical post-processing of the measurement outcomes is flawless.) 

Though we will not discuss the details, by following the principles of quantum fault tolerance, gadgets
with these properties can be constructed for a stabilizer code that corrects $t$ errors. Property 0 says that a
1-EC with $r$ faults takes any input to an output that deviates from the code space by no more than distance
$r$, and property 1 expresses that a 1-EC with $r$ faults takes an input with $s$ errors to an output that can still
be successfully decoded. Property 2 says that a 1-Ga does not propagate errors too badly: if the 1-Ga has
$r$ faults, then the number of errors in each output block is not more than $r$ plus the total number of errors
in all input blocks. Property 3 says that these output blocks can be accurately decoded.

For the case $t = 1$, these properties capture (syntactically) the content of the properties stated in Sec. 2
and 3. The new property 0 replaces the old properties 0 and 0', the new property 1 replaces the old properties
1 and 2, and the new properties 2 and 3 replace the old properties 3 and 4.

Using these new properties, we can establish: 

Lemma 5. exRec-Cor at level 1 (for a code that corrects t errors). Suppose that the level-1 gadgets
obey properties 0-3. Then the 1-exRecs obey exRec-Cor.

Proof: To simplify notation, we will consider the case of a single-qubit gate; the argument is essentially
the same for a multi-qubit gate. A 1-exRec is good if it contains no more than $t$ faults. Suppose there are $s$
faults in the leading 1-EC, $r$ faults in the 1-Ga, and $s'$ faults in the trailing 1-EC, with $s + r + s' \le t$. Then
the 1-exRec followed by the ideal 1-decoder can be expressed as

    [s-good 1-EC] -> [r-good 1-Ga] -> [s'-good 1-EC] -> [ideal 1-decoder]

Using property 0, we can insert a filter after the leading 1-EC, and using property 2, we can insert a filter
after the 1-Ga:

    [s-good 1-EC] -> [s-filter] -> [r-good 1-Ga] -> [(r+s)-filter] -> [s'-good 1-EC] -> [ideal 1-decoder]

Using property 1, we can omit the trailing 1-EC, then using property 2, we can omit the filter that precedes
it:



    [s-good 1-EC] -> [s-filter] -> [r-good 1-Ga] -> [ideal 1-decoder]



Finally, using property 3, we can move the decoder to the left, converting the 1-Ga to the ideal 0-Ga, and
using property 0, we can remove the filter that precedes it, thus obtaining

    [s-good 1-EC] -> [ideal 1-decoder] -> [ideal 0-Ga]

This proves exRec-Cor.

□ 
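Each step of the proof of Lemma 5 invokes one of the properties 0-3, and each property carries a side condition of the form "some count is at most $t$". The following sketch, with hypothetical function names, lists those side conditions for a given fault pattern $(s, r, s')$ and checks by brute force that they are all met whenever $s + r + s' \le t$. It is a bookkeeping aid for the argument, not a simulation of the gadgets.

```python
from itertools import product


def lemma5_steps(s, r, s_prime):
    """Side conditions used in the proof, as (property number, quantity that must be <= t)."""
    return [
        (0, s),                   # insert the s-filter after the leading 1-EC
        (2, r + s),               # insert the (r+s)-filter after the 1-Ga
        (1, s_prime + (r + s)),   # drop the trailing 1-EC
        (2, r + s),               # drop the (r+s)-filter again
        (3, r + s),               # move the decoder past the 1-Ga
        (0, s),                   # drop the s-filter
    ]


def all_steps_allowed(t):
    """True if every good fault pattern (s + r + s' <= t) satisfies every side condition."""
    for s, r, s_prime in product(range(t + 1), repeat=3):
        if s + r + s_prime > t:
            continue  # not a good 1-exRec
        if any(bound > t for _, bound in lemma5_steps(s, r, s_prime)):
            return False
    return True


if __name__ == "__main__":
    print([all_steps_allowed(t) for t in range(1, 6)])
```

The tightest condition is the one for property 1, which uses the full budget $s + r + s' \le t$; this is why goodness of the whole 1-exRec, rather than of its parts separately, is the right hypothesis.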

Our proof of the threshold theorem applies without much modification to codes that correct $t \ge 2$ errors.
To define badness at level $k + 1$, we need a notion of what it means for two successive overlapping k-exRecs
to fail independently. Again, we are guided by the goal of proving exRec-Cor inductively via the threshold
dance. As the ideal k-decoders sweep to the left through a (k+1)-exRec, they hop over a full k-exRec
that contains more than $t$ bad (k-1)-exRecs. Then whether the preceding gate in the circuit is simulated
accurately is determined by whether the k-ECGa contains more than $t$ faults. Thus, we may say that a
(k+1)-exRec is bad if it contains $t+1$ independent bad k-exRecs, where two successive bad exRecs are said
to be independent only if the earlier k-ECGa contains $t+1$ independent bad (k-1)-exRecs.

With this definition of goodness we can prove exRec-Cor at level k inductively. Furthermore, since a
k-exRec is bad only if it contains $t+1$ (k-1)-exRecs that fail independently, the probability that a
k-exRec is bad satisfies

$$\varepsilon^{(k)} \le A \left(\varepsilon^{(k-1)}\right)^{t+1} , \tag{46}$$

where $A$ is the number of ways to choose $t+1$ locations in the largest exRec. This inequality can be rewritten
as

$$\varepsilon^{(k)}/\varepsilon_0 \le \left(\varepsilon^{(k-1)}/\varepsilon_0\right)^{t+1} , \tag{47}$$

where

$$\varepsilon_0 = A^{-1/t} , \tag{48}$$

which can be iterated to find

$$\varepsilon^{(k)} \le \varepsilon_0 \left(\varepsilon/\varepsilon_0\right)^{(t+1)^k} . \tag{49}$$
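The iteration leading to eq. (49) can be checked numerically. The sketch below treats eq. (46) as an equality (the worst case) and compares the level-by-level recursion with the closed form; the function names and the sample values of $A$, $t$, and $\varepsilon$ are illustrative choices, not values taken from the analysis.

```python
from math import isclose


def iterate_bound(eps, big_a, t, k):
    """Apply eps -> A * eps**(t+1) k times, i.e. eq. (46) with equality."""
    for _ in range(k):
        eps = big_a * eps ** (t + 1)
    return eps


def closed_form(eps, big_a, t, k):
    """Right-hand side of eq. (49), with eps0 = A**(-1/t) as in eq. (48)."""
    eps0 = big_a ** (-1.0 / t)
    return eps0 * (eps / eps0) ** ((t + 1) ** k)


if __name__ == "__main__":
    for k in range(4):
        print(k, iterate_bound(1e-6, 165025, 1, k), closed_form(1e-6, 165025, 1, k))
```

Below threshold ($\varepsilon < \varepsilon_0$) the bound shrinks doubly exponentially in $k$, with the exponent base $t+1$ set by the number of errors the code corrects.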
Arguing as in Sec. 4 we arrive at a generalization of Theorem 1. 

Theorem 4. Quantum accuracy threshold for independent stochastic noise (using a code that
corrects t errors). Suppose that fault-tolerant gadgets can be constructed such that all 1-exRecs obey the
property exRec-Cor, where a 1-exRec that contains no more than $t$ faults is said to be good. Suppose that $\ell$
is the maximal number of locations in a 1-Rec, $d$ is the maximal depth of a 1-Rec, and $\varepsilon_0^{-t}$ is the maximal
number of ways to choose $t+1$ locations in a 1-exRec. Suppose that independent stochastic faults occur with
probability $\varepsilon < \varepsilon_0$ at each location in a noisy quantum circuit. Then for any fixed $\delta$, any ideal circuit with $L$
locations and depth $D$ can be simulated with error $\delta$ or better by a noisy circuit with $L^*$ locations and depth
$D^*$, where

$$L^* = O\left(L (\log L)^{\alpha}\right) , \qquad D^* = O\left(D (\log L)^{\beta}\right) , \tag{50}$$

and

$$\alpha = \frac{\log \ell}{\log(t+1)} , \qquad \beta = \frac{\log d}{\log(t+1)} . \tag{51}$$



As in Sec. 6, we can improve the threshold by counting malignant sets of t + 1 locations, thus obtaining a 
generalization of Theorem 2. 
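To get a feel for the overhead in Theorem 4, the following sketch picks the concatenation level from eq. (49). It assumes, as a simplification that is not derived in this section, that the simulation error is controlled by a union bound $L\,\varepsilon^{(k)} \le \delta$ over the $L$ locations of the ideal circuit. The function names and sample parameters are illustrative only.

```python
def levels_needed(eps, eps0, t, n_locations, delta, max_levels=50):
    """Smallest k with L * eps0 * (eps/eps0)**((t+1)**k) <= delta (assumed union bound)."""
    if not 0 < eps < eps0:
        raise ValueError("the fault rate must lie strictly below the threshold")
    for k in range(max_levels + 1):
        if n_locations * eps0 * (eps / eps0) ** ((t + 1) ** k) <= delta:
            return k
    raise ValueError("no level up to max_levels suffices")


def size_blowup(ell, k):
    """Each level replaces a location by a 1-Rec with at most ell locations."""
    return ell ** k


if __name__ == "__main__":
    k = levels_needed(eps=1e-5, eps0=2.73e-5, t=1, n_locations=1e9, delta=0.01)
    print(k)
```

Because the required $k$ grows only like $\log\log(L/\delta)/\log(t+1)$, the blowup $\ell^k$ is polylogarithmic in $L$, which is the content of eqs. (50) and (51).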

10 An alternative proof 

The proof of Theorem 1 that we presented in Sec. 4 applies to the case of concatenated distance-3 codes,
and also, as extended in Theorem 4, to concatenation of codes with larger distance. Here we will present
another proof, really an elaboration of the proof in [2], which applies to concatenation of codes with distance
5 or more, but not to codes with distance 3.

As in our previous proof, the key is to define notions of goodness and correctness so that "good implies 
correct" is satisfied by fault-tolerant level-1 gadgets, and can also be established for higher-level rectangles
by an inductive argument. Again, whether a rectangle simulates an ideal gate accurately depends on the 
context, on what precedes the rectangle in the circuit. In our previous argument, the appropriate context 
was established by considering the extended rectangle that contains the rectangle. Now the context will 
be provided by considering the output from the preceding rectangle. With this approach, we avoid the 
complication of dealing with extended rectangles that overlap with one another. But other complications 
arise instead. 

Since many of the same ideas are used in this proof as in the proof of Theorem 1, we will be sketchier 
than before, emphasizing the parts of the argument that require modification. 

Fig. 15. Level-1 simulation for a distance-5 code. Each 0-Ga in the ideal circuit is replaced by
a 1-Rec, which consists of the 1-Ga that simulates the 0-Ga, preceded by a 1-EC acting on each
input 1-block of the 1-Ga.



10.1 Properties of the level-1 gadgets 

In our level-1 simulation, a 0-Ga of the ideal circuit will be simulated by a 1-Rec that is composed of the
appropriate 1-Ga preceded by 1-ECs acting on each input 1-block, as shown in Fig. 15. (It is useful for the
1-ECs in the 1-Rec to act before the 1-Ga, so that we can formulate and use the property Rec-SemiCor,
stated below.)

Because we are now using a quantum error-correcting code that can correct two errors, it is possible by 
following the principles of quantum fault tolerance to construct a universal set of 1-Recs with the property: 

Rec-Cor at level 1. If a 1-Rec contains no more than one fault, and its input has no more than one error 
in each input block, then its output has no more than one error in each output block. 

We will say that a 1-Rec is good if it contains no more than one fault, and that a 1-Rec is correct if it takes 
an input with no more than one error per 1-block to an output with no more than one error per 1-block. (A 
preparation 1-Rec is correct if its output has no more than one error, and a measurement 1-Rec is correct 
if it successfully measures a 1-block that has no more than one error.) Then the property Rec-Cor can be 
stated more succinctly as 

Rec-Cor at level 1. A good 1-Rec is correct. 

For independent stochastic faults that occur with probability $\varepsilon$ at each circuit location, a level-1 simulation
of an ideal circuit with $L$ locations will be successful if every 1-Rec is good. Therefore, we may again bound
the probability of failure of the simulation using eq. (3), but where now $A$ is the number of pairs of locations
in the largest 1-Rec.

10.2 Recursive simulation: validity, goodness, and correctness

The level of the simulation is advanced by one level by replacing each 0-Ga by a 1-Rec. We define goodness
for a k-Rec in the obvious way: it is good if it contains no more than one bad (k-1)-Rec. Thus the
probability that a k-Rec is bad satisfies eq. (6), where $\varepsilon_0^{-1}$ is the number of pairs of locations in the largest
1-Rec.

Defining correctness is more subtle. In order to carry out the inductive step smoothly, we will define
correctness using an ideal k-decoder, but we also wish to capture the idea that a correct k-Rec simulates
an ideal gate accurately if each input k-block has no more than one (k-1)-block that is badly damaged.
For this purpose, we will introduce a notion of validity for k-blocks. A 1-block is valid if it deviates by no
more than distance 1 from the code space, and a k-block is valid if its deviation from the code space can be
attributed to errors that are sparse in a certain sense.

To make these notions precise, we need to define our ideal k-decoder properly. The ideal 1-decoder
operates as follows: First, it measures and records the error syndrome. If the syndrome indicates a single
error in the 1-block, that error is corrected, and then the 1-block is decoded to a qubit. But if the syndrome
indicates more than one error, the 1-block is decoded as a random qubit (a maximally mixed state). Finally,
the syndrome is discarded. Note that although the code is capable in principle of correcting two errors, the
ideal decoder does not fully exploit the code's error-correcting power.

The level-k decoder is defined recursively; it is realized by first applying the (k-1)-decoder to each of
the (k-1)-subblocks, and then applying the 1-decoder to the resulting 1-block.

We say that the state of a k-block is valid if, with probability one, the outcome of the syndrome
measurement indicates no more than one error at the top level. That is, if the ideal k-decoder is viewed as
(k-1)-decoders acting on all (k-1)-subblocks, followed by a final 1-decoder, then a k-block is valid if, with
probability one, the final 1-decoder finds a syndrome pointing to no more than one error. Thus, validity is a
property of a state that can be checked locally, at least in principle; whether a state of a k-block is valid
does not concern the nature of the entanglement of the k-block with other k-blocks. In fact, valid states
form a linear subspace of the Hilbert space of the k-block, and we can define a validity filter, an orthogonal
projector onto this subspace.
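The recursive structure of the ideal k-decoder and of validity can be illustrated with a purely classical caricature, in which each physical qubit either carries an error or does not. This toy ignores superpositions entirely (so the "with probability one" clause is trivial), and the block size of 11 used in the tests is an arbitrary choice; it is meant only to show how damage propagates through the recursion, under those stated simplifications.

```python
def decode(block):
    """Toy ideal k-decoder.

    A leaf is a bool (True means this physical qubit carries an error); an inner
    node is a list of sub-blocks. Returns True if the decoded qubit is damaged.
    """
    if isinstance(block, bool):
        return block
    damaged = sum(decode(sub) for sub in block)
    # The 1-decoder repairs at most one error; otherwise it outputs a random qubit.
    return damaged > 1


def is_valid(block):
    """Valid: the final (top-level) 1-decoder sees no more than one error."""
    if isinstance(block, bool):
        return True
    return sum(decode(sub) for sub in block) <= 1


if __name__ == "__main__":
    one_err = [True] + [False] * 10
    two_err = [True, True] + [False] * 9
    print(is_valid([two_err] + [one_err] * 10))
```

In the toy model a 2-block containing one badly damaged 1-subblock is still valid and still decodes correctly, while two badly damaged subblocks make it invalid, mirroring the "sparse errors" intuition described above.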

With our definition of validity in hand, we can formulate what we mean by correctness. 

Definition. Correctness. A k-Rec is correct if, acting on a valid input, the k-Rec followed by the ideal
k-decoder is equivalent to the ideal k-decoder followed by the ideal 0-Ga that the k-Rec simulates:

    valid input -> [correct k-Rec] -> [ideal k-decoder]  =  valid input -> [ideal k-decoder] -> [ideal 0-Ga]

(When we say that the input is valid, we mean that each input block is valid.) Preparation and measurement
are special cases. A k-preparation is correct if the ideal k-decoder maps its output to the ideally prepared
state:

    [correct k-prep.] -> [ideal k-decoder]  =  [ideal 0-prep.]

and a k-measurement is correct if, acting on a valid input, it realizes the same POVM as the ideal k-decoder
followed by the ideal 0-measurement:

    valid input -> [correct k-meas.]  =  valid input -> [ideal k-decoder] -> [ideal 0-meas.]



To prove the threshold theorem, we will need two properties of k-Recs:

Rec-Cor. A good k-Rec is correct.

Rec-Val. For any input, the output of a good k-Rec is valid.

(When we say that the output is valid, we mean that all output blocks are valid.) If both properties hold,
then the level-k fault-tolerant simulation of an ideal quantum circuit will succeed if all k-Recs are good.
To reach this conclusion, we first use Rec-Val to infer that every k-Rec has a valid output. Since the final
measurements are correct and have valid inputs, we can replace them by ideal k-decoders followed by ideal
0-measurements. Now since all k-Recs simulating gates are correct and have valid inputs, we can move the
k-decoders to the left through the circuit one step at a time until they reach the k-preparations. Since the
k-preparations are correct, the ideal k-decoders transform them to ideal 0-preparations.

For distance-5 codes, we can build fault-tolerant 1-gadgets such that Rec-Cor and Rec-Val are true. Now
we need to show that they are true at level k if they are true at level k-1. To carry out the inductive step,
it is useful to introduce a third property. We define

Definition. Semi-correctness. A k-Rec is semi-correct if, for any input, the k-Rec followed by the ideal
k-decoder is equivalent to the k-EC contained in the k-Rec, followed next by the ideal k-decoder and then by
the ideal 0-Ga that the k-Rec simulates:

    [k-EC] -> [k-Ga] -> [ideal k-decoder]  =  [k-EC] -> [ideal k-decoder] -> [ideal 0-Ga]

And the third property is

Rec-SemiCor. A good k-Rec is semi-correct.

Our fault-tolerant 1-gadgets can also be constructed to have the property Rec-SemiCor. Thus, to complete 
our alternative proof of the threshold theorem we need 

Lemma 6. Good implies correct. Suppose that all 1-Recs satisfy properties Rec-Cor, Rec-Val, and
Rec-SemiCor. Then Rec-Cor, Rec-Val, and Rec-SemiCor hold for all k-Recs at each level $k \ge 1$.

We assume the induction hypothesis, that Rec-Cor, Rec-Val, and Rec-SemiCor hold at level k. We are
to prove all three properties at level k+1. As in Sec. 5, the proofs exploit the recursive construction of
the ideal (k+1)-decoder: it can be regarded as k-decoders acting on all k-subblocks, followed by a final
1-decoder. Using the induction hypothesis, we aim to move the k-decoders step-by-step to the left through
the (k+1)-Rec, transforming it to a good 1-Rec, and then use the properties of the 1-Recs to complete the
proof.

By Rec-Val at level k, each good k-Rec has a valid output. Therefore, using Rec-Cor at level k, the
k-decoder can be moved one step to the left past a good k-Rec that follows other good k-Recs, converting
the k-Rec to an ideal 0-Ga.

But as in Sec. 5, the key to the argument is to show that a bad k-Rec gets transformed to a faulty 0-Ga
as the k-decoder moves to the left. A good k-Rec that follows a bad k-Rec may not have a valid input,
so we cannot use Rec-Cor to justify moving a k-decoder to the left through the k-Rec. Instead, we use
Rec-SemiCor, and move the k-decoder through the k-Ga, transforming it to an ideal 0-Ga.

Now the bad k-Rec is accompanied by its trailing k-ECs, forming what we will call a bad k-RecEC.
We would like to move the k-decoders left through the k-RecEC, transforming it to a faulty 0-Ga. But
we encounter the same technical glitch that we confronted in Sec. 5, and we overcome the difficulty in the
same way, by replacing the k-decoders by k-*decoders that retain their output syndromes. Then, as in that
previous discussion, moving k-*decoders to the left past a good k-Rec that follows good k-Recs transforms
the k-Rec to an ideal 0-Ga (using Rec-Val and Rec-Cor at level k), tensored with an operation that acts
only on the syndrome. Similarly, moving k-*decoders to the left past the k-Ga contained in a good k-Rec
transforms the k-Ga to an ideal 0-Ga (using Rec-SemiCor at level k), tensored with an operation that acts
only on the syndrome. Furthermore, moving k-*decoders left past a bad k-RecEC transforms it to a faulty
0-Ga that acts jointly on its input qubits and on the syndrome.

In the proof of Rec-Cor at level k+1, we are to assume that the input blocks of the (k+1)-Rec are
valid. If a (k+1)-block is valid, that means that if k-decoders are applied to all the k-subblocks, then with
probability one at most one of the k-subblocks has a syndrome indicating two or more errors at the top
level. Therefore, a valid (k+1)-block can be expanded, such that for each term in the expansion no more
than one k-subblock is invalid. In the proof of Rec-Cor below, we will glibly say that the valid (k+1)-block
has no more than one invalid k-subblock. But what we are really doing is considering one fixed term in this
expansion; since the proof works for each such term, it works for arbitrary valid input (k+1)-blocks.

Proof of Rec-Cor: The (k+1)-decoders placed behind the (k+1)-Rec can be realized as k-*decoders applied
to all k-subblocks of the output (k+1)-blocks, followed by final 1-decoders (where the output syndromes of
the k-*decoders are discarded). As described above, using Rec-Val, Rec-Cor, and Rec-SemiCor at level k,
we can sweep the k-*decoders that follow the (k+1)-Rec to the left one step at a time until they reach the
front of the (k+1)-Rec; thereby we transform each good k-Rec to an ideal 0-Ga, and each bad k-Rec to a
faulty 0-Ga.

The input to the (k+1)-Rec is valid; therefore, each input (k+1)-block has at most one invalid k-block.
One of the k-Recs in the first time step of the (k+1)-Rec acts on the invalid input k-subblock. If this k-Rec
is good, then we may use Rec-SemiCor at level k to move the k-*decoder to the left through the k-Ga,
transforming the k-Ga to an ideal 0-Ga. If it is bad, we can move the k-*decoder to the left through the bad
k-RecEC, transforming the RecEC to a faulty 0-Ga.

Since the (k+1)-Rec is good, after we have moved all the k-*decoders to the left, they are followed by a
good 1-Rec, and then by final 1-decoders. Furthermore, the input to the 1-Rec is valid.

Using Rec-Cor at level 1, we can move the 1-decoders to the left past the 1-Rec, transforming it to an
ideal 0-Ga. Now the (k+1)-decoders act on the input (k+1)-blocks, except for a possible k-EC acting on
the one invalid k-subblock of each input (k+1)-block, preceding the action of the (k+1)-decoder on the
input. But the input (k+1)-block is valid, and therefore an operation acting on its one invalid block does
not interfere with the action of the decoder: the k-EC is of no consequence and can be removed without
altering the output of the (k+1)-decoder. Thus we have shown Rec-Cor for the (k+1)-Rec.

□ 

Proof of Rec-SemiCor: The (k+1)-decoders placed behind the (k+1)-Rec can be realized as k-*decoders
applied to all k-subblocks of the output (k+1)-blocks, followed by final 1-decoders (where the output
syndromes of the k-*decoders are discarded). As in the above proof of Rec-Cor, we can sweep the k-*decoders
to the left through the good (k+1)-Rec, transforming it to a good 1-Rec, followed by final 1-decoders.

Using Rec-SemiCor at level 1, we can move the 1-decoders to the left past the 1-Ga, transforming it to
an ideal 0-Ga. Now the k-*decoders are in front of the 1-EC and the 1-decoders are behind it. Again using
Rec-Cor at level k, we can move the k-*decoders back to the right, placing them behind the 1-ECs, while
replacing the 1-ECs with the original k-ECs. This puts the k-*decoders right in front of the final 1-decoders,
reassembling the full (k+1)-decoders. Thus we have proven Rec-SemiCor for the (k+1)-Rec.

□ 

Proof of Rec-Val: We test the validity of the output from the (k+1)-EC by applying the (k+1)-*decoder, 
realized as k-*decoders acting on all k-subblocks, followed by a final 1-*decoder. The output is valid if, with 
probability one, the final level-1 syndrome measurement records no more than one error. 

As in the above proof of Rec-Cor, we sweep the k-*decoders to the left through the good (k+1)-Rec, 
transforming it to a good 1-Rec, followed by a final 1-*decoder. Using Rec-Val at level 1, we see that the 
final 1-block is valid, no matter what state emerges from the k-*decoders. Thus, the 1-*decoder records no 
more than one error, and we conclude that the output of the (k+1)-Rec is valid. 

□ 

Straightforward modifications of this argument show that the (k+1)-preparation and (k+1)-measurement 
satisfy Rec-Cor, completing the proof of: 

Theorem 5. Quantum accuracy threshold for independent stochastic noise (alternative version). 
Suppose that fault-tolerant gadgets can be constructed such that all 1-Recs obey the properties Rec-Cor, 
Rec-Val, and Rec-SemiCor. Suppose that $\ell$ is the maximal number of locations in a 1-Rec, d is the maximal 
depth of a 1-Rec, and $\varepsilon_0^{-1}$ is the maximal number of pairs of locations in a 1-Rec. Suppose that independent 
stochastic faults occur with probability $\varepsilon < \varepsilon_0$ at each location in a noisy quantum circuit. Then for any fixed 
$\delta$, any ideal circuit with L locations and depth D can be simulated with error $\delta$ or better by a noisy circuit 
with L* locations and depth D*, where 

$L^* = O\left( L (\log L)^{\log_2 \ell} \right)$ , $D^* = O\left( D (\log L)^{\log_2 d} \right)$ . (52) 

As for our previous version of the threshold theorem, we can improve the threshold estimate by counting 
malignant pairs of locations in the 1-Recs, where a pair of locations is benign if the 1-Rec satisfies the 
properties Rec-Cor, Rec-Val, and Rec-SemiCor for arbitrary faults at that pair of locations. We can also 
extend the theorem to concatenation of codes that correct t ≥ 2 errors. 

Thus for concatenation of a code of distance 2t + 1 ≥ 5, we can use either Theorem 4 or (the extension of) 
Theorem 5 to estimate the threshold. Typically, Theorem 4 yields a more favorable estimate. For example, 
in the distance-5 case, we may either apply Theorem 4 to exRecs or Theorem 5 to Recs. But at level 1, 
exRec-Cor is more likely to hold than Rec-Cor for a given fault rate; at least three faults in a 1-exRec are 
needed for exRec-Cor to fail, while two faults in a 1-Rec can cause Rec-Cor to fail. 

11 Local non-Markovian noise 
11.1 A more general noise model 

We have so far restricted our attention to a rather special noise model. We have assumed that faults are 
uncorrelated in space and in time, and we have assumed that each fault leaves a permanent record in the 
environment, attesting that the fault occurred. 

Let us refer to a particular history, indicating all the locations in a circuit where faults occur, as a fault 
path. If the environment records the fault path — that is, if different fault paths are labeled by perfectly 
distinguishable states of the environment — then distinct fault paths decohere with one another. Therefore, 
we may assign a probability to each fault path. In our proof of the threshold theorem, we derived an upper 
bound on the sum of the probabilities of all the bad fault paths that contribute to an extended rectangle. 

By adopting this model of independent stochastic noise, we were able to explain the key ideas underlying 
the proof of the threshold theorem in an especially simple setting. But the theorem we have proved so far 
has limited applicability. For example, it does not apply to the case of unitary errors in which each gate is 
a unitary transformation that deviates from the ideal gate by a slight over-rotation or under-rotation. Such 
errors leave no record. Furthermore, our threshold theorem does not apply if the faults are weakly correlated 
in space and/or in time. Such correlations are expected in realistic implementations. 

We will now broaden the setting, and extend our proof of the threshold theorem to a more general noise 
model, which we will call local noise. The defining characteristic of local noise is that the noise does not 
act collectively on many data qubits at once; rather the noise acts either on individual data qubits, or acts 
collectively on a set of data qubits that interact during the execution of a quantum gate. But the different 
noisy gates act on an environment or "bath" as well as on the data, and we allow the various noisy gates to 
share the same bath. In effect, the bath serves as a memory that can store information for an indefinitely long 
time — the noise is non-Markovian. Although the noise acts locally on the data, the information recorded 
by the bath allows the faults to be correlated, both in time and in space. We will see that these correlations 
do not prevent us from proving a threshold theorem, provided the noise is sufficiently weak. 

A threshold theorem for non-Markovian noise has also been formulated in an insightful paper by Terhal 
and Burkard [7]. Our analysis differs from theirs, and has two distinct advantages. First, for the proof in 
[7] to apply, a locality condition must be imposed not only on the interaction of the bath with the data, 
but also on interactions among the degrees of freedom within the bath. In contrast, we will not need any 
assumption about the locality of the bath. Second, the proof in [7] applies when the fault-tolerant gadgets are 
nonoverlapping, but has no obvious extension to the case where the gadgets overlap. Therefore, it applies 
to concatenated distance-5 codes but not to concatenated distance-3 codes. In contrast, our approach is 
applicable even for the overlapping gadgets that arise in the analysis of concatenated distance-3 codes. 

From a physics perspective, a non-Markovian noise model is most naturally formulated in terms of a 
Hamiltonian H that governs the joint evolution of the "system" (the data qubits that are processed by the 
computation) and the "bath" (the environment whose interactions with the system drive the noise). We 
may express H as 

$H = H_S + H_B + H_{SB}$ , (53) 

where $H_S$ is the time-dependent Hamiltonian of the system that realizes the ideal circuit, $H_B$ is an arbitrary 
Hamiltonian of the bath, and the Hamiltonian $H_{SB}$ coupling the system to the bath is 

$H_{SB} = \sum_a H_{SB,a}$ ; (54) 

here each $H_{SB,a}$ describes how a particular set of qubits a, which are acted upon by a particular gate in 
the ideal circuit, interact with the bath. The system-bath interaction $H_{SB}$ describes the noise, which is 
"local" in the sense that two (or more) data qubits are coupled by $H_{SB}$ in a given time step only if the ideal 
Hamiltonian $H_S$ also couples those data qubits in the same time step. 

We can express the time evolution governed by the Hamiltonian H in terms of a "time-resolved fault 
path" expansion [7]. That is, we divide the total time T into N time intervals each of width $\Delta = T/N$, where 
N is sufficiently large that terms of order $\Delta^2$ can be safely neglected. Then the time evolution operator for 
the time interval $(t, t + \Delta)$ can be accurately expressed as 

$U(t + \Delta, t) \approx e^{-i\Delta H_S} e^{-i\Delta H_B} \prod_a \left( I_{SB} - i\Delta H_{SB,a} \right)$ , (55) 

and the time evolution operator U(T, 0) for interval (0, T) is the product of N such operators. For each 
summand in the fault path expansion, at each time-resolved location labeled by t and a, we insert either 
$I_{SB}$ (in which case we say there is no fault at that location) or $-i\Delta H_{SB,a}$ (in which case we say there is a 
fault at that location). 

Suppose that the time required to execute each gate in the quantum circuit is $t_0$, so that associated with 
each gate is a "coarse-grained" location consisting of $t_0/\Delta$ consecutive time-resolved locations. We say that 
a coarse-grained location is faulty if there is a fault at any one of the time-resolved locations that it contains. 
If the coarse-grained location is not faulty, then the noise is trivial there and the ideal gate is executed 
faithfully. 

We will use the sup operator norm to characterize the strength of the noise; the norm of an operator 
A is denoted $\| A \|$ and defined by 

$\| A \| = \sup_{|\psi\rangle} \left( \| A|\psi\rangle \| \, / \, \| |\psi\rangle \| \right)$ ; (56) 

this norm has the properties 

$\| A + B \| \le \| A \| + \| B \|$ , $\| AB \| \le \| A \| \cdot \| B \|$ , $\| A \otimes B \| = \| A \| \cdot \| B \|$ . (57) 
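
As a quick numerical sanity check of the first two properties in eq. (57), the following sketch evaluates the sup operator norm of 2x2 complex matrices in closed form, as the square root of the largest eigenvalue of $A^\dagger A$. The helper names and the example matrices are illustrative assumptions and play no role in the proof.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    """Entrywise sum of two 2x2 matrices."""
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def dagger(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def op_norm(X):
    """Sup operator norm: sqrt of the largest eigenvalue of X^dagger X."""
    M = matmul(dagger(X), X)                 # Hermitian, positive semidefinite
    p, r, q = M[0][0].real, M[1][1].real, M[0][1]
    largest = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + abs(q) ** 2)
    return math.sqrt(largest)

# Hypothetical example operators.
A = [[1, 2j], [0, 1]]
B = [[0.5, 0], [1, -1j]]

print(op_norm(add(A, B)), "<=", op_norm(A) + op_norm(B))
print(op_norm(matmul(A, B)), "<=", op_norm(A) * op_norm(B))
```

The tensor-product property is not checked here, since it would require a 4x4 eigenvalue computation.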

Suppose that the system-bath coupling at each time-resolved location satisfies 

$\| H_{SB,a} \| < \lambda_0$ . (58) 

Specify r particular coarse-grained locations in a quantum circuit, and let E denote the sum over all 
time-resolved fault paths with faults at those r locations, where the time-resolved fault path is completely 
unrestricted outside of the r coarse-grained locations. Then, as shown in [7], $\| E \| \le (\lambda_0 t_0)^r$. This observation 
motivates the following definition: 

Definition. Local Noise. Consider a noisy quantum circuit, realized as a unitary transformation $U_{SB}$ 
acting on a system S and a bath B, with $U_{SB}$ expressed as a fault-path expansion where each term is 
characterized by a set of fault locations. Let $\mathcal{I}_r$ denote a particular set of r (coarse-grained) locations, and 
let $E(\mathcal{I}_r)$ denote the sum over all fault paths that contain faults at all of the locations in $\mathcal{I}_r$. Then we say 
that the noise is local with strength $\eta$ if for all $\mathcal{I}_r$ 

$\| E(\mathcal{I}_r) \| \le \eta^r$ . (59) 

By this definition, our Hamiltonian noise model is local, with noise strength $\eta = \lambda_0 t_0$, for an arbitrary bath 
Hamiltonian $H_B$. Note that we are free to shift $H_{SB}$ by a constant in order to optimize $\eta$ [7]. 

Of course, it is also possible to formulate local noise models without direct reference to a Hamiltonian. 
Instead, we could characterize the noisy gates as unitary operators associated with circuit locations, where 
we allow the different noisy gates to act on a shared bath (and in fact arbitrary unitary transformations 
may be applied to the bath in between the successive quantum gates). To be specific, we will describe such 
a noise model first of all for a single-qubit gate. If the ideal gate applies the unitary transformation $U_{\rm ideal}$ to 
the qubit, the actual noisy gate instead applies a unitary transformation $U = U_{\rm fault} U_{\rm ideal}$ to the qubit and 
the bath, where 

$U_{\rm fault} = \left( I \otimes A_0 + X \otimes A_1 + Y \otimes A_2 + Z \otimes A_3 \right)$ . (60) 

Here {I, X, Y, Z} are the Pauli operators, and the operators $\{A_0, A_1, A_2, A_3\}$ are arbitrary operators acting 
on the state of the bath (subject to the constraint that $U_{\rm fault}$ is unitary). Then we may say that the noise 
has strength $\eta$ if each noisy gate in the circuit obeys 

$\| X \otimes A_1 + Y \otimes A_2 + Z \otimes A_3 \| \le \eta$ . (61) 

A measurement can be modeled as $U_{\rm fault}$ followed by an ideal measurement, and a preparation can be 
modeled as an ideal preparation followed by $U_{\rm fault}$. Similarly, a noisy two-qubit gate can be expressed as 
the ideal gate followed by an operator that can be expressed as an expansion in Pauli operators; the noise 
strength $\eta$ is an upper bound on the norm of the sum of the operators in this expansion, excluding the leading 
operator $I \otimes I \otimes A_{00}$. Note that, because gates that act in the same time step may have noncommuting 
actions on the bath, an operator ordering convention is needed to unambiguously specify the model, but the 
noise strength does not depend on how the ordering ambiguity is resolved. 

If each noisy gate acts on an independent bath, and the operators $I \otimes A_0$ and $(X \otimes A_1 + Y \otimes A_2 + Z \otimes A_3)$ map 
the environment to mutually orthogonal states, then our local noise model describes independent stochastic 
faults. In that case the probability of a fault is not $\eta$ but rather $\varepsilon = \eta^2$. We may regard $\eta$ as the amplitude 
weighting an error, whose square is a probability. If $A_0$, $A_1$, $A_2$, and $A_3$ are all scalar multiples of one 
another, then our local noise model describes unitary errors. For example, if $U_{\rm fault}$ acts only on a single data 
qubit and has eigenvalues $e^{\pm i\theta/2}$, then $\eta = | \sin(\theta/2) |$. 
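
The unitary-error example can be made concrete with a small sketch. It assumes, purely for illustration, that the fault is an over-rotation about the Z axis, so that $U_{\rm fault} = {\rm diag}(e^{-i\theta/2}, e^{+i\theta/2})$ with no bath involved; the function names are hypothetical.

```python
import cmath
import math

def pauli_z_components(theta):
    """Write U_fault = diag(e^{-i theta/2}, e^{+i theta/2}) as a0*I + a3*Z."""
    u00 = cmath.exp(-1j * theta / 2)
    u11 = cmath.exp(+1j * theta / 2)
    a0 = (u00 + u11) / 2   # coefficient of I, equal to cos(theta/2)
    a3 = (u00 - u11) / 2   # coefficient of Z, equal to -i sin(theta/2)
    return a0, a3

def noise_strength(theta):
    """Norm of the non-identity part a3*Z; since ||Z|| = 1 this is |a3|."""
    _, a3 = pauli_z_components(theta)
    return abs(a3)

theta = 0.02                       # assumed small over-rotation angle
eta = noise_strength(theta)
print(eta, abs(math.sin(theta / 2)))   # the two values agree
print(eta ** 2)                        # the square of the amplitude
```

For small angles the amplitude is close to $\theta/2$, whereas its square is of order $\theta^2/4$; this is the gap between a threshold stated for an amplitude and one stated for a probability.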

Our proof of the threshold theorem for independent stochastic noise already applies to a restricted type 
of non-Markovian noise, in which distinct fault paths do not interfere. That proof works as long as each fault 
path with r faulty locations can be assigned a probability no larger than $\varepsilon^r$; once a particular fault path is 
selected, we may allow the faulty locations to share a bath. We emphasized this point in the discussion of 
the inductive step in Sec. 5.2.1, where we noted that a level-(k+1) circuit built out of gates with independent 
stochastic faults can be regarded as a simulation of a level-1 circuit where the bad k-exRecs simulate faults 
that share a bath. The new analysis we will now describe is needed not just because we wish to consider 
faults that share a bath, but also because of the potential for quantum interference among the fault paths. 

11.2 Bounding the norm of the sum over bad fault paths 

The proof of the threshold theorem for local noise has two parts. First we will show that if the noise strength 
$\eta$ is sufficiently small, then the norm of the sum over bad fault paths becomes doubly exponentially small 
as the level k of the simulation increases. Then in Sec. 11.4 we will show that this small norm implies that 
the noisy circuit accurately simulates the ideal computation. 

The proof works whether or not overlapping rectangles are invoked in the definition of goodness. We 
will present the proof for nonoverlapping rectangles, and then elaborate on its applicability to overlapping 
rectangles in Sec. 11.3. The result can be extended to the case where ancillas are rejected whenever a 
verification test fails, but we will ignore that complication in our proof. 

To begin, consider a noisy circuit with a total of A level-0 locations, and let $F_s$ denote the sum over all 
fault paths with s or more faults. Can we find an upper bound on $\| F_s \|$? We would like to express $F_s$ as a 
sum of terms of the form $E(\mathcal{I}_r)$, since then we can bound the norm of the sum by the sum of the norms, and, 
using the definition of local noise, bound each norm in the sum by a power of $\eta$. But it is not completely 
straightforward to relate $F_s$ to the $E(\mathcal{I}_r)$'s. In particular, if we sum $E(\mathcal{I}_s)$ over all sets of s locations, each 
fault path with more than s faulty locations will be counted multiple times. Let $\mathcal{I}$ denote the set of all locations 
in a circuit (with $|\mathcal{I}| = A$), and let $E_r$ denote 

$E_r = \sum_{\mathcal{I}_r \subseteq \mathcal{I}} E(\mathcal{I}_r)$ , (62) 

the sum of the $E(\mathcal{I}_r)$'s over all sets of r locations in the circuit. The crux of our analysis is this combinatoric 
lemma: 

Lemma 7. Counting of fault paths. 

$F_s = \sum_{r=s}^{|\mathcal{I}|} (-1)^{r-s} \binom{r-1}{s-1} E_r$ . (63) 



Proof: Eq. (63) can be derived from the inclusion-exclusion principle (see for example [36]). For completeness, 
we provide a self-contained discussion of how the result can be obtained. To count correctly all the 
fault paths with s or more faults, we first sum over all ways of choosing s locations, and for each of these 
choices, we sum over all fault paths that are bad at those s locations. But then we have overcounted the fault 
paths that are bad at (at least) s + 1 locations, so we make a subtraction to correct for the overcounting. 
But now, because of this subtraction, we have undercounted the fault paths that are bad at (at least) s + 2 
locations, so we need to make an addition to correct for the undercounting, and so on. 

To determine the combinatoric factors in this expansion, we first choose an arbitrary ordering of the $|\mathcal{I}|$ 
circuit locations. This ordering is used just to facilitate the counting, and need not bear any relation to 
the time ordering in the circuit; nevertheless we will use temporal language to describe the ordering, e.g., 
speaking of an "earlier" location or a "later" location. At each circuit location i, we divide the sum over fault 
paths into a good part $G_i$ and a bad part $B_i$. (For example, in the Hamiltonian model the good part includes 
only the case where $I_{SB}$ is inserted at every time-resolved location contained in the coarse-grained location i, 
and the bad part includes every case in which $-i\Delta H_{SB,a}$ is inserted at at least one time-resolved location.) 
$F_s$ can be expressed as a sum of terms, where in each term the s earliest fault locations are identified: $B_i$ is 
inserted at each of these s locations, $G_i$ is inserted at all other locations prior to the last of the s earliest bad 
locations, and $G_i + B_i$ is inserted at each location after the last of the s earliest bad locations. With this 
scheme, every fault path with s or more faulty locations has been included exactly once. We refer to this expansion 
of $F_s$ as the "original" expansion; from it we are to obtain the "derived" expansion eq. (63). 

Next we rewrite each $G_i$ in each term of the original expansion as 

$G_i = (G_i + B_i) - B_i$ (64) 

and expand in powers of $B_i$ to obtain the derived expansion. The resulting term with $B_i$ at each of a set 
of locations $\mathcal{I}_r$, and $G_i + B_i$ elsewhere, is $E(\mathcal{I}_r)$. In how many ways can this term arise in the derived 
expansion? Of the r $B_i$'s, s are "primary" $B_i$'s that are already present in the original expansion, and the 
rest are "secondary" $B_i$'s that arose from expanding $G_i = (G_i + B_i) - B_i$. The $B_i$ at the latest bad location 
in the circuit must be a primary $B_i$, but each of the other r - 1 could be either one of the remaining s - 1 
primary $B_i$'s or a secondary $B_i$. There are $\binom{r-1}{s-1}$ ways to choose which of the $B_i$ other than the last one are 
primary, and the number of minus signs is the number of secondary $B_i$'s; thus we find eq. (63). 



□ 
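
The counting in Lemma 7, $F_s = \sum_{r=s}^{|\mathcal{I}|} (-1)^{r-s} \binom{r-1}{s-1} E_r$, can be checked by brute force. A fault path with exactly m faulty locations is contained in $\binom{m}{r}$ of the terms $E(\mathcal{I}_r)$ that make up $E_r$, so the lemma is equivalent to the statement that the signed sum of these multiplicities is 1 when m ≥ s and 0 otherwise. The sketch below (the function name is illustrative) verifies this for small values.

```python
from math import comb

def lemma7_weight(m, s):
    """Net coefficient with which a fault path having exactly m faulty
    locations appears in sum_r (-1)^(r-s) * C(r-1, s-1) * E_r.
    The path is contained in C(m, r) of the terms that make up E_r."""
    return sum((-1) ** (r - s) * comb(r - 1, s - 1) * comb(m, r)
               for r in range(s, m + 1))

for s in range(1, 6):
    for m in range(0, 12):
        expected = 1 if m >= s else 0
        assert lemma7_weight(m, s) == expected

print("every fault path with s or more faults is counted exactly once")
```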



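Using Lemma 7 and the definition of local noise, we obtain an upper bound on the norm of $F_s$: 

$\| F_s \| \le \sum_{r=s}^{A} \binom{r-1}{s-1} \binom{A}{r} \eta^r \le \binom{A}{s} \eta^s \sum_{r=s}^{A} \binom{A-s}{r-s} \eta^{r-s} = \binom{A}{s} \eta^s (1+\eta)^{A-s} \le e^{(A-s)\eta} \binom{A}{s} \eta^s$ . (65) 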



Now consider a level-1 simulation, in which each gate of an ideal circuit is replaced by the corresponding 
1-Rec, and let $\mathcal{I}_r^{(1)}$ denote a particular set of r 1-locations (locations of 1-Recs) in the 1-simulation. Let us 
say that a 1-Rec is bad if it contains s or more faults, and consider $E(\mathcal{I}_r^{(1)})$, the sum over all fault paths in 
the 1-simulation such that the 1-locations in $\mathcal{I}_r^{(1)}$ are bad. 

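Now we can carry out the "original" expansion in each of the 1-Recs, and obtain a "derived" expansion 
by expanding $G_i = (G_i + B_i) - B_i$ at each good level-0 location. Label the r bad 1-locations by the index 
$b \in \{1, 2, \ldots, r\}$, let $\mathcal{I}(b)$ denote the set of all 0-locations in bad 1-Rec b, and let $\mathcal{I}(b)_\ell$ denote a particular 
set of $\ell$ locations in bad 1-Rec b. Summing independently over the sets of bad locations inside each of the r 
bad 1-Recs, and reasoning as in the derivation of Lemma 7, we find 

$E(\mathcal{I}_r^{(1)}) = \sum_{\ell_1=s}^{|\mathcal{I}(1)|} (-1)^{\ell_1-s} \binom{\ell_1-1}{s-1} \cdots \sum_{\ell_r=s}^{|\mathcal{I}(r)|} (-1)^{\ell_r-s} \binom{\ell_r-1}{s-1} \sum_{\mathcal{I}(1)_{\ell_1} \subseteq \mathcal{I}(1)} \cdots \sum_{\mathcal{I}(r)_{\ell_r} \subseteq \mathcal{I}(r)} E\left( \mathcal{I}(1)_{\ell_1} \cup \cdots \cup \mathcal{I}(r)_{\ell_r} \right)$ . (66) 

For local noise, we have 

$\| E\left( \mathcal{I}(1)_{\ell_1} \cup \cdots \cup \mathcal{I}(r)_{\ell_r} \right) \| \le \eta^{\ell_1 + \cdots + \ell_r} = \prod_{b=1}^{r} \eta^{\ell_b}$ . (67) 

Therefore, our bound on $\| E(\mathcal{I}_r^{(1)}) \|$ factorizes into r factors, each of the form in eq. (65), and so we obtain: 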



Lemma 8. Norm of the sum over level-1 fault paths. Define a 1-Rec to be bad if it contains s or 
more faults, and let $E(\mathcal{I}_r^{(1)})$ denote the sum over all fault paths in a 1-simulation such that the r 1-locations 
in the set $\mathcal{I}_r^{(1)}$ are bad. Then for local noise with strength $\eta$ 

$\| E(\mathcal{I}_r^{(1)}) \| \le \left( \eta^{(1)} \right)^r$ , (68) 

where 

$\eta^{(1)} = C \binom{A}{s} \eta^s$ , (69) 

$C \ge e^{(A-s)\eta}$, and A is the maximal number of locations inside any 1-Rec. 
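
The constant C in Lemma 8 comes from bounding the sum $\sum_{r=s}^{A} \binom{r-1}{s-1} \binom{A}{r} \eta^r$ (Lemma 7 combined with the triangle inequality and the definition of local noise) by $e^{(A-s)\eta} \binom{A}{s} \eta^s$. The following sketch compares the two expressions numerically; the values of A, s and $\eta$ are arbitrary illustrative choices, not taken from any particular gadget.

```python
from math import comb, exp

def exact_sum(A, s, eta):
    """sum_{r=s}^{A} C(r-1, s-1) * C(A, r) * eta^r."""
    return sum(comb(r - 1, s - 1) * comb(A, r) * eta ** r
               for r in range(s, A + 1))

def closed_form_bound(A, s, eta):
    """C * C(A, s) * eta^s with the choice C = exp((A - s) * eta)."""
    return exp((A - s) * eta) * comb(A, s) * eta ** s

A, s, eta = 50, 2, 1e-3   # hypothetical rectangle size and noise strength
print(exact_sum(A, s, eta))
print(closed_form_bound(A, s, eta))
```

For weak noise the leading term $\binom{A}{s} \eta^s$ dominates both expressions, which is why C stays close to 1.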

Note that eq. (68) has just the same form as eq. (59), so we can use the same reasoning as in the 
derivation of Lemma 8 to take the inductive step. Now suppose that a k-Rec is bad if it contains s or more 
bad (k-1)-Recs, let $\mathcal{I}_r^{(k)}$ denote a particular set of r k-locations in a k-simulation, and let $E(\mathcal{I}_r^{(k)})$ denote the 
sum over all fault paths such that the r k-locations in $\mathcal{I}_r^{(k)}$ are bad. Suppose that $\| E(\mathcal{I}_r^{(k)}) \| \le (\eta^{(k)})^r$; then 
we may argue that $\| E(\mathcal{I}_r^{(k+1)}) \| \le (\eta^{(k+1)})^r$, where $\eta^{(k+1)} = C \binom{A}{s} (\eta^{(k)})^s$ and $C \ge e^{(A-s)\eta}$. Therefore we 
have proved 

Lemma 9. Norm of the sum over bad level-k fault paths. Define a k-Rec to be bad if it contains s 
or more bad (k-1)-Recs, and let $E(\mathcal{I}_r^{(k)})$ denote the sum over all fault paths in a k-simulation such that the 
r k-locations in the set $\mathcal{I}_r^{(k)}$ are bad. Then for local noise with strength $\eta$ 

$\| E(\mathcal{I}_r^{(k)}) \| \le \left( \eta^{(k)} \right)^r$ , (70) 

where 

$\eta^{(k)} = \eta_0 \left( \eta / \eta_0 \right)^{s^k}$ , (71) 

$\eta_0 = \left( C \binom{A}{s} \right)^{-1/(s-1)}$ , (72) 

$\eta < \eta_0$, $C \ge e^{(A-s)\eta_0}$, and A is the maximal number of locations inside any 1-Rec. 

We can optimize our estimate of $\eta_0$ by setting $C = e^{(A-s)\eta_0}$ and solving eq. (72) for $\eta_0$, or more simply we 
may choose C = e, since the inequality $(A-s)^{s-1} \le \eta_0^{-(s-1)} = e \binom{A}{s}$ is satisfied for typical level-1 simulations 
of interest (e.g., for s = 2). 
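
To see how the recursion behind Lemma 9 behaves, the sketch below iterates $\eta^{(j+1)} = C \binom{A}{s} (\eta^{(j)})^s$ and compares the result with the closed form $\eta^{(k)} = \eta_0 (\eta/\eta_0)^{s^k}$, where $\eta_0 = (C \binom{A}{s})^{-1/(s-1)}$ is the fixed point of the recursion. The rectangle size A = 100 and the ratio $\eta/\eta_0 = 0.1$ are assumed values chosen only for illustration, and C = e is the simple choice described above.

```python
from math import comb, e

def eta_threshold(A, s, C=e):
    """eta_0 = (C * C(A, s)) ** (-1 / (s - 1)), the fixed point of the recursion."""
    return (C * comb(A, s)) ** (-1.0 / (s - 1))

def eta_level(eta, k, A, s, C=e):
    """Iterate eta -> C * C(A, s) * eta**s for k levels."""
    for _ in range(k):
        eta = C * comb(A, s) * eta ** s
    return eta

A, s = 100, 2                 # hypothetical number of locations, s = 2
eta0 = eta_threshold(A, s)
eta = 0.1 * eta0              # assumed physical noise strength below threshold

# The choice C = e is self-consistent when (A - s) * eta0 <= 1.
print("(A - s) * eta0 =", (A - s) * eta0)

for k in range(4):
    closed = eta0 * (eta / eta0) ** (s ** k)
    print(k, eta_level(eta, k, A, s), closed)
```

Each additional level squares the ratio $\eta^{(k)}/\eta_0$ when s = 2, which is the double-exponential suppression used in the accuracy estimate.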

11.3 Overlapping rectangles 

Now let us note that Lemma 9 applies even when overlapping rectangles are used to define the badness of 
fault paths. At first, the overlaps may seem to complicate the discussion. We again wish to organize the 
sum over all the fault paths such that a particular k-location is bad into a sum of terms, where in each 
term a specified set of r (k-1)-locations contained inside the k-location are bad. But now, depending on 
the context, a "location" can be either an exRec (if the locations immediately following are all good), or a 
truncated exRec (if a location immediately following is bad). 

Nevertheless, for any specified fault path, whether the location is truncated or not, and whether it is bad 
or not, is unambiguously determined, and in order for the location to be bad, it must contain at least s bad 
locations at the next level down. For a k-simulation, we can, just as before, express the sum over all fault 
paths with r or more bad k-exRecs as an inclusion-exclusion sum of the form in eq. (63). The combinatorics, 
and therefore the coefficients in the sum, are the same as for the case of nonoverlapping rectangles. The 
only change is that for each fault path appearing in the sum, some exRecs are truncated and some are 
untruncated, and which ones are truncated varies from fault path to fault path. 

A truncated k-exRec has fewer locations than the full k-exRec. Therefore, at each level k, the upper 
bound eq. (70) holds whether or not some of the r bad k-exRecs are truncated, and the inductive proof of 
Lemma 9 goes through, where now A is the maximal number of locations inside any 1-exRec. 

11.4 Accuracy 

Now we have seen that if $\eta < \eta_0$, the norm of the bad part of a k-simulation declines double-exponentially 
with the level k. What does this imply about the accuracy of the simulation? 

The sum over all fault paths for a level-k simulation of a circuit with L locations can be divided into a 
good part $G^{(k)}_{\rm circuit}$ and a bad part $B^{(k)}_{\rm circuit}$, where in the good part every k-exRec is good. We can bound the 
norm of the bad part by combining eq. (65) with Lemma 9, finding 

$\| B^{(k)}_{\rm circuit} \| \le L \eta^{(k)} \exp\left( (L-1) \eta^{(k)} \right) \le e L \eta^{(k)}$ , (73) 

where to obtain the last inequality we have assumed $(L-1) \eta^{(k)} \le 1$. 

Now, the good part of the circuit is correct. Strictly speaking, we have proved this only when all of the 
0-faults are physical operations. But we can always express the sum over the good fault paths such that in 
each term (except for an overall normalization) every 0-fault is a Pauli operator (which is physical) and then 
invoke linearity to conclude that the sum is correct because all of its terms are correct. 

The final outcome of the level-k simulation is determined by measuring a quantum state. This state 
might be mixed but by considering its purification, we may imagine that the measured state is a pure state 
of the data and the environment (including ancillas), which can be expressed as a sum of good and bad parts 
as 

$|\psi\rangle = G^{(k)}_{\rm circuit} |\psi_0\rangle + B^{(k)}_{\rm circuit} |\psi_0\rangle$ . (74) 

Here $|\psi_0\rangle$ is the ideal input to the circuit; we may move any faults in the input state and any faults in 
the final measurement into the circuit, so we may imagine that the input and the final readout are ideal 
(recall that we are assuming that the classical decoding of the measurement outcomes is reliable). Up to 
normalization, the good part of the state, when measured, produces the same probability distribution of 
outcomes as the ideal circuit. Therefore, our bound on the norm of $B^{(k)}_{\rm circuit}$ allows us to bound the 
accuracy of the simulation. 

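Consider a POVM with elements $\{E_a\}$ and let $\rho$ and $\tilde\rho$ be arbitrary density operators. Let $p$ denote the 
probability distribution for the outcomes when the POVM is performed on the state $\rho$ and let $\tilde p$ denote the 
probability distribution for the outcomes when the POVM is performed on the state $\tilde\rho$. Then the $L^1$ distance 
between the probability distributions is 

$\delta = \| p - \tilde p \|_1 = \sum_a | p_a - \tilde p_a | = \sum_a | {\rm tr}\, E_a (\rho - \tilde\rho) | \le \sum_{a,i} | \lambda_i | \langle i | E_a | i \rangle = \| \rho - \tilde\rho \|_{\rm tr}$ . (75) 

Here $\{|i\rangle\}$ are the eigenstates of $\rho - \tilde\rho$, $\{\lambda_i\}$ are the corresponding eigenvalues, $\| \cdot \|_{\rm tr}$ denotes the trace norm, 
and in the last step we have used the completeness relation $\sum_a E_a = I$. If both states are pure, $\rho = |\psi\rangle\langle\psi|$ 
and $\tilde\rho = |\tilde\psi\rangle\langle\tilde\psi|$, then the two nonzero eigenvalues of $\rho - \tilde\rho$ are $\pm\sin\theta$, where $|\langle\tilde\psi|\psi\rangle| = \cos\theta$, and therefore 

$\| \rho - \tilde\rho \|_{\rm tr} = 2 | \sin\theta |$. 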
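
The pure-state formula, trace norm equal to $2|\sin\theta|$ when the overlap of the two states is $\cos\theta$, can be checked directly for a single qubit. The sketch below restricts to real state vectors $(\cos\alpha, \sin\alpha)$ and $(\cos\beta, \sin\beta)$, an assumption made only to keep the 2x2 eigenvalue computation elementary; the function name is illustrative.

```python
import math

def trace_norm_of_difference(alpha, beta):
    """Trace norm of rho - rho~ for the real qubit states
    |psi> = (cos alpha, sin alpha) and |psi~> = (cos beta, sin beta)."""
    # Each projector is [[c^2, c*s], [c*s, s^2]], so the difference is a
    # real, symmetric, traceless matrix [[x, y], [y, -x]].
    x = math.cos(alpha) ** 2 - math.cos(beta) ** 2
    y = (math.cos(alpha) * math.sin(alpha)
         - math.cos(beta) * math.sin(beta))
    # Its eigenvalues are +/- sqrt(x^2 + y^2); the trace norm sums their moduli.
    return 2 * math.hypot(x, y)

alpha, beta = 0.3, 0.45
theta = beta - alpha           # the overlap of the two states is cos(theta)
print(trace_norm_of_difference(alpha, beta), 2 * abs(math.sin(theta)))
```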

To bound the accuracy of the level-k simulation, then, we need to estimate the angle $\theta$ between $|\psi\rangle$ given 
by eq. (74) and the (properly normalized) good part of this state. With the norm of the bad part of the 
state fixed, the angle is maximal when the good and bad parts are orthogonal, in which case $|\sin\theta|$ is the 
norm of the bad part. We conclude that the error of the simulation satisfies 

$\delta = \| p - \tilde p \|_1 \le 2 \| B^{(k)}_{\rm circuit} \| \le 2 e L \eta^{(k)} \le 2 e L \eta_0 (\eta/\eta_0)^{s^k}$ . (76) 

This is essentially the same scaling of the accuracy with the level k as found in eq. (10), except that now the 
noise strength $\eta$ is a fault amplitude, while for independent stochastic errors, the fault probability $\varepsilon = \eta^2$ 
appeared instead. 
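
The error bound $2 e L \eta_0 (\eta/\eta_0)^{s^k}$ can be turned into an estimate of how many levels of concatenation are needed for a target accuracy. The sketch below is only an illustration: the circuit size, target error, rectangle size A and ratio $\eta/\eta_0$ are assumed values, C = e is used in $\eta_0 = (C \binom{A}{s})^{-1/(s-1)}$, and the function name is hypothetical.

```python
from math import comb, e

def levels_needed(L, delta, eta, A, s=2, C=e, k_max=50):
    """Smallest k with 2*e*L*eta_0*(eta/eta_0)**(s**k) <= delta.
    Returns None if eta is not below eta_0 or no k <= k_max suffices."""
    eta0 = (C * comb(A, s)) ** (-1.0 / (s - 1))
    if not eta < eta0:
        return None
    for k in range(k_max + 1):
        if 2 * e * L * eta0 * (eta / eta0) ** (s ** k) <= delta:
            return k
    return None

A = 100                                   # hypothetical rectangle size
eta0 = 1.0 / (e * comb(A, 2))             # threshold for s = 2, C = e
print(levels_needed(L=10**9, delta=0.01, eta=0.1 * eta0, A=A))
```

Because the suppression is double exponential in k, the required level grows only like the logarithm of log(L/δ).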

Arguing as in Sec. 4, we then obtain 

Theorem 6. Quantum accuracy threshold for local non-Markovian noise. Suppose that fault-tolerant 
gadgets can be constructed such that all 1-exRecs obey the property exRec-Cor, where a 1-exRec that 
contains no more than 1 fault is said to be good. Suppose that $\ell$ is the maximal number of locations in 
a 1-Rec, d is the maximal depth of a 1-Rec, and $\eta_0^{-1}/e$ is the maximal number of pairs of locations in a 
1-exRec. Suppose that local noise occurs with strength $\eta < \eta_0$ at each location in a noisy quantum circuit. 
Then for any fixed $\delta$, any ideal circuit with L locations and depth D can be simulated with error $\delta$ or better 
by a noisy circuit with L* locations and depth D*, where 

$L^* = O\left( L (\log L)^{\log_2 \ell} \right)$ , $D^* = O\left( D (\log L)^{\log_2 d} \right)$ . (77) 

The theorem specifies the same scaling of the overhead as the theorem for independent stochastic noise, and 
the value of the accuracy threshold is similar. The important difference is that for local noise the threshold 
condition must be satisfied by an error amplitude rather than an error probability. Of course, the theorem 
can be extended to the case where a 1-exRec that contains no more than t faults is said to be good. 
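
The polylogarithmic form of the overhead can be illustrated with a short sketch. It assumes that each level of concatenation replaces every location by a 1-Rec containing at most $\ell$ locations and of depth at most d, so that k levels multiply the size by at most $\ell^k$ and the depth by at most $d^k$; the function names and numbers are illustrative only.

```python
import math

def overhead(L, D, ell, d, k):
    """Upper bounds L * ell**k and D * d**k on size and depth after k levels."""
    return L * ell ** k, D * d ** k

def polylog_factor(x, ell):
    """x ** log2(ell), which is the same number as ell ** log2(x)."""
    return x ** math.log2(ell)

print(overhead(L=10, D=5, ell=3, d=2, k=2))
# With 2**k of order log L, the factor ell**k becomes (log L) ** log2(ell):
print(polylog_factor(8, 4), 4 ** 3)
```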

Because distinct quantum gates are permitted to act on a shared bath, our local noise model incorporates 
correlations among faults, both in space and in time. The key to deriving an accuracy threshold is not strictly 
speaking the suppression of such correlations; rather it suffices for the coupling of the data qubits to the 
bath to be sufficiently weak. Though the noise acts locally on the data, we emphasize again that we did not 
need to impose any locality condition on the Hamiltonian of the bath; $H_B$ is completely arbitrary. 

11.5 Extension to counting malignant sets 

We can improve the estimate of the threshold by counting malignant sets of locations as in Sec. 6. Though 
the analysis also applies to overlapping rectangles, we will consider the case of nonoverlapping rectangles 
here, and speak of Recs rather than exRecs. 

For example, suppose that there are B malignant pairs of locations in a 1-Rec. Then the 1-Rec can be 
bad if faults occur at a malignant pair of locations, or if faults occur at any set of three or more locations in 
the 1-Rec (where it may be that no two of these locations form a malignant pair). Again, we wish to bound 
the norm of the sum of all bad fault paths. 

The sum of all fault paths with three or more faults in the 1-Rec can be expressed as in Lemma 7 (with 
s = 3). We need to add to this an additional contribution due to the fault paths with exactly two faults 
that form a malignant pair. For each malignant pair we place B at each of the locations in the malignant 
pair, and G at all other locations; then we expand each G = (G + B) - B. In this expansion, how many 
times does the term occur with B inserted at each of the locations in a particular set $\mathcal{I}_r$ of r locations? Any 
two of these r locations could in principle be the "primary" locations, but only if these two locations form a 
malignant pair. Therefore the term occurs at most $\binom{r}{2}$ times (if every pair of the r locations is a malignant 
pair) and no fewer than zero times (if no pair of the r locations is a malignant pair). On the other hand, 
each set of r locations occurs $\binom{r-1}{2}$ times in the expansion arising from all sets of three or more locations, 
and furthermore, these terms are opposite in sign to the terms that arise from the malignant pairs. When 
we combine the two expansions, the absolute value of the coefficient for a term with r bad locations is at 
most the larger of $\binom{r-1}{2}$ and $\binom{r}{2} - \binom{r-1}{2} = r - 1$. Invoking Lemma 7, and reasoning as in the proof of Lemma 
8, we therefore have: 

Lemma 10. Norm of the sum over level-1 fault paths with malignant pairs. Define a 1-Rec to be 
bad if it contains faults at a malignant pair of locations, or at some set of three or more locations. Suppose 
that there are no more than B malignant pairs of locations in any 1-Rec, and at most A locations in any 
1-Rec. Let $E(\mathcal{I}_r^{(1)})$ denote the sum over all fault paths in a 1-simulation such that the r 1-locations in the 
set $\mathcal{I}_r^{(1)}$ are bad. Then for local noise with strength $\eta$ 

$\| E(\mathcal{I}_r^{(1)}) \| \le \left( \eta^{(1)} \right)^r$ , (78) 

where 

$\eta^{(1)} = B \eta^2 + (C+1) \binom{A}{3} \eta^3$ , (79) 

and $C \ge e^{(A-3)\eta}$. 

(The coefficient C multiplying $\binom{A}{3} \eta^3$ arises from estimating the sum over r as in eq. (65); the +1 arises 
from the r = 3 term, because $\max\left( r-1, \binom{r-1}{2} \right) = r - 1 = 2$ for r = 3.) A similar estimate applies at 
higher levels, so we can obtain an improved value for the accuracy threshold by counting malignant pairs. 
Furthermore, the argument also works for overlapping rectangles. 
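
The gain from counting malignant pairs can be seen by comparing the level-1 effective noise strength of Lemma 10, $B \eta^2 + (C+1) \binom{A}{3} \eta^3$ with $C = e^{(A-3)\eta}$, against the cruder estimate $e \binom{A}{2} \eta^2$ that treats every pair of locations as bad (Lemma 8 with s = 2 and C = e). The numbers below (A = 100 locations, B = 500 malignant pairs, $\eta = 10^{-5}$) are hypothetical and do not come from any circuit analyzed in this paper.

```python
from math import comb, exp, e

def eta1_all_pairs(A, eta, C=e):
    """Lemma 8 with s = 2: every pair of locations is treated as bad."""
    return C * comb(A, 2) * eta ** 2

def eta1_malignant(A, B, eta):
    """Lemma 10: B malignant pairs plus all sets of three or more locations."""
    C = exp((A - 3) * eta)
    return B * eta ** 2 + (C + 1) * comb(A, 3) * eta ** 3

A, B, eta = 100, 500, 1e-5     # hypothetical values
print(eta1_all_pairs(A, eta))
print(eta1_malignant(A, B, eta))
```

When B is much smaller than the total number of pairs and $\eta$ is small, the malignant-pair estimate is dominated by $B \eta^2$ and is correspondingly smaller.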

For the independent stochastic noise model, we estimated the threshold fault rate $\varepsilon_0$ by counting 
malignant pairs of faults, as reported in the proof of Theorem 3. It is not quite straightforward to adapt that 
analysis to obtain an estimate of the threshold noise strength $\eta_0$ for the local noise model. The circuits 
studied in Sec. 8 include ancilla verification steps, and the threshold analysis involves an estimate of the effective 
fault rate in the ancillas that pass the verification test. Similarly, to estimate $\eta_0$ using the same circuits, we 
would need to bound the effective noise strength in postselected ancillas, and we have not performed this 
calculation. 

11.6 Decoherence of fault paths via syndrome measurement 

Our estimate of the quantum accuracy threshold for general local noise is much worse than for stochastic 
errors, because the accuracy threshold for local noise specifies a value of a fault amplitude (the noise strength) 
rather than a fault probability. Expressed as a critical fault probability, the threshold for general local noise 
is in effect the square of the threshold for stochastic noise. 

This conclusion seems overly pessimistic, for several reasons. The crucial difference between the stochastic 
model and the local model is that in the stochastic model, each fault path in an exRec is associated with a 
perfectly distinguishable state of the environment, and therefore the fault paths decohere. In contrast, for 
local noise the states of the environment associated with distinct fault paths need not be distinguishable, 
and therefore the bad fault paths might interfere constructively. Thus we obtain a much more demanding 
condition on the noise. 

In practice, though, it would seem to require a highly implausible conspiracy for many bad fault paths 
to add together with a common phase. Under a more plausible scenario in which the phases of distinct 
bad fault paths are only weakly correlated, we expect amplitudes to accumulate more like a random walk 
in the complex plane — that is, it is more reasonable to add the probabilities of the fault paths than their 
amplitudes. 
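
The contrast between adding amplitudes and adding probabilities can be illustrated with a small numerical experiment. The sketch below is not part of the analysis: it assumes n fault paths of equal amplitude, and it models "weakly correlated" phases by the idealized extreme of fully independent, uniformly distributed phases. The values of n and the amplitude are arbitrary.

```python
import cmath
import math
import random

def coherent_norm(n: int, amplitude: float) -> float:
    """All n fault-path amplitudes share a common phase, so norms add linearly."""
    return abs(sum(amplitude * cmath.exp(0j) for _ in range(n)))

def rms_random_phase_norm(n: int, amplitude: float, trials: int, seed: int = 0) -> float:
    """Root-mean-square norm when each path carries an independent uniform phase."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        s = sum(amplitude * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
                for _ in range(n))
        total += abs(s) ** 2
    return math.sqrt(total / trials)

n, amplitude = 100, 1e-3  # illustrative values only
print(coherent_norm(n, amplitude))                # scales like n * amplitude
print(rms_random_phase_norm(n, amplitude, 2000))  # scales like sqrt(n) * amplitude
```

In the random-phase case the squared norms add on average, so the total grows like the square root of the number of paths rather than linearly.
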

However comforting in practice, this observation is of little help if we aspire to prove a rigorous theorem, 
since we are obligated to consider the worst possible case that is compatible with the assumptions of our 
noise model. But there is another observation that is more likely to point the way to a stronger rigorous 
result than found in Sec. 11.4. The "environment" includes not just the system needed to provide a dilation 
of our noise model, but also the ancillas that are used in the measurement of error syndromes. Therefore, 
for two different bad fault paths of a k-exRec to interfere, they must have identical error syndromes. 

We might, then, find an improved threshold estimate for local noise by carrying out the following 
stratagem. For the 1-exRec, suppose that we can divide the bad fault paths into N classes, where fault 
paths in distinct classes are unable to interfere with one another. Suppose that for each class, the norm of 
the sum of the contributions to $B^{(1)}$ arising from that class can be expressed as a sum of no more than M 
terms, each with norm no larger than $e\eta^2$. Since the norms of the decoherent contributions to $B^{(1)}$ add in 
quadrature, we then find that 

$$\| B^{(1)} \|^2 \le N (M e \eta^2)^2 ; \tag{80}$$

the condition $\| B^{(1)} \| < \| B^{(0)} \|$ then implies a value of the threshold noise strength $\eta_0$ where 

$$\eta_0^{-1} = e M \sqrt{N} . \tag{81}$$

At least roughly, this threshold estimate interpolates between our result for general local noise (N = 1, 
M = A), and our result for independent stochastic noise (N = A, M = 1). 
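
To make the interpolation concrete, the following Python sketch evaluates the threshold estimate in the two limits. It reads eq. (81) as $\eta_0^{-1} = e M \sqrt{N}$ and assumes that e denotes Euler's number. The function name and the value of A are hypothetical, chosen only for illustration and not taken from any circuit analyzed in the paper.

```python
import math

def threshold_noise_strength(num_classes: int, terms_per_class: int) -> float:
    """Return eta_0 = 1 / (e * M * sqrt(N)), with N classes and M terms per class."""
    return 1.0 / (math.e * terms_per_class * math.sqrt(num_classes))

A = 1000  # hypothetical count, not a value from the paper

local_limit = threshold_noise_strength(1, A)       # N = 1, M = A
stochastic_limit = threshold_noise_strength(A, 1)  # N = A, M = 1

print(local_limit)       # 1 / (e * A)
print(stochastic_limit)  # 1 / (e * sqrt(A))
```

Any nontrivial partition with N > 1 and M < A would give a value between these two limits, provided the product of M and the square root of N is smaller than A.
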

For the seven-qubit Steane code analyzed in Sec. 8, there are $2^6 = 64$ possible values for the syndrome, 
and the CNOT exRec, which dominates the threshold estimate for the case of independent stochastic 
errors, contains four syndrome measurements. Deriving nontrivial bounds on M and N, which we have not 
attempted, might lead to a substantial improvement in the accuracy threshold for local noise. 

12 Conclusions 

We have proven six versions of the quantum accuracy threshold theorem, all based on recursive simulations. 
Theorem 1 assumes an independent stochastic noise model, and establishes a threshold for a fault-tolerant 
simulation based on the concatenation of a distance-3 quantum code. Theorem 2 is a refinement of Theorem 
1 that improves the estimate of the accuracy threshold. Theorem 3 derives a lower bound on the accuracy 
threshold, by applying Theorem 2 to Steane's [[7,1,3]] code, and Theorem 4 extends Theorem 1 to concatenation 
of codes that correct two or more errors. Theorem 6 generalizes the noise model considered in Theorem 
1 to local non-Markovian noise. 

In the proofs of Theorems 1-4, we assess the accuracy of a recursive simulation by characterizing how 
faults are distributed in extended rectangles that overlap with one another. Theorem 5 establishes an accuracy 
threshold using a different approach in which accuracy is assessed by studying how faults are distributed in 
rectangles that do not overlap. Theorem 6 is applicable to both approaches. 

Other versions of the quantum threshold theorem have been discussed previously [2, 3, 4], but ours is 
the first rigorous proof that applies to the concatenation of a code that can correct only one error, and it 
provides the best lower bound on the accuracy threshold that has been rigorously established up to now. 
The proof uses novel ideas that we expect will have further applications in future studies of fault-tolerant 
quantum simulations. The crux of the proof of Theorems 1-4 is the inductive step based on the threshold 
dance depicted in Fig. 5, which shows that if a level-k rectangle is contained in a good extended rectangle, 
then the rectangle followed by an ideal decoder is equivalent to an ideal decoder followed by an ideal gate. 

Theorem 5 is really a reformulation and elaboration of the proof in [2] that, we hope, brings the essential 
elements of that argument into sharper focus. Theorem 6 shows that an accuracy threshold based on 
concatenated distance-3 codes can be established for noise models that allow faults to be spatially and 
temporally correlated, even if the environment can store quantum information for an indefinitely long time. 
The proof uses a different method, and applies under weaker assumptions, than the previous analysis of 
non-Markovian noise in [7]. 

In all proofs of the threshold theorem, including ours, certain assumptions are essential. The noise 
strength must be bounded above by a sufficiently small constant that is independent of the size of the 
computation. To flush from the computer the entropy arising from noise, we must have an inexhaustible 
supply of fresh ancilla qubits (or the ability to refresh and reuse our ancilla qubits an indefinite number of 
times) [18]. To control memory noise that afflicts all parts of the computer simultaneously, we must be able 
to execute quantum gates in parallel. To combat errors effectively, we must assume a noise model in which 
faults that act collectively on many qubits are highly suppressed. 

For our explicit gadget constructions in Sec. 8, and in the formulation of Theorems 1-6, we have made 
additional assumptions that are not necessary for the existence of an accuracy threshold, but are helpful 
for simplifying the analysis and improving the numerical value of the threshold. We have assumed that 
a qubit can be measured as quickly as a quantum gate can be executed, and we have assumed that the 
classical processing of measurement outcomes can be done instantaneously and flawlessly. We have assumed 
that two-qubit quantum gates can be executed with a fixed accuracy on any pair of qubits, irrespective of 
their spatial proximity. And we have assumed a noise model that excludes "leakage errors" in which qubits 
become inaccessible and must be replaced. It would be useful to repeat the threshold calculation with some 
or all of these assumptions relaxed. 

Under these same assumptions, it is also possible to derive a quantum accuracy threshold using 
two-dimensional topological codes [37, 38]. The lower bound on the threshold for quantum memory found in 
[39], based on topological codes, is comparable to the memory threshold that can be established for the 
concatenated [[7,1,3]] code using the method in Sec. 8. However, an optimized estimate using topological 
codes of the computation threshold has not been attempted, as the purification protocol for the C3 ancilla is 
rather complicated in this case [40]. It has also been shown [41] that an especially favorable overhead scaling 
can be established for fault-tolerant quantum computing using higher-dimensional topological codes. 

Our detailed threshold estimate in Sec. 8 was carried out for the [[7,1,3]] code. It would be interesting to 
analyze other codes that might conceivably yield a higher threshold, such as the [[23,1,7]] Golay code [12], 
or polynomial codes [42]. It might also be useful to improve the threshold result for general local noise, by 
pursuing the program sketched in Sec. 11.6. 

Another looming challenge is to put Knill's recent threshold estimates [13] on a rigorous footing. Knill 
uses a more complex simulation method, in which concatenated error-detecting codes are used to prepare 
certain encoded ancilla states. Based on numerical studies, he reports a threshold value three orders of 
magnitude higher (!) than the value established by our Theorem 3. We hope that the methods we have 
introduced in this paper can be adapted to the analysis of Knill's scheme or related schemes, leading to 
tighter rigorous lower bounds on the quantum accuracy threshold. 